Introduction

Cervical cancer is one of the most common cancers affecting women worldwide. According to the World Health Organization (WHO), approximately 604,000 new cases of cervical cancer were reported in 2020, resulting in approximately 342,000 deaths globally1. More concerningly, an increasing incidence has been observed in younger age groups2. For patients with advanced cervical cancer (aCC), concurrent chemoradiotherapy and brachytherapy are the preferred treatments, with acceptable efficacy for most patients. However, about 30–40% of patients, particularly those with advanced disease, experience relapse during treatment, posing a significant threat to survival3,4. According to Surveillance, Epidemiology, and End Results (SEER) data from 2011 to 2017, while the 5-year survival rate for localized cervical cancer was 91.9%, rates for regional and distant disease declined to 58.2% and 17.6%, respectively. Advances in genomics, proteomics, and high-throughput sequencing have enabled precise genetic diagnosis, targeted therapy5, and immunotherapy6, revolutionizing cancer diagnosis and treatment. In particular, combining radiotherapy and chemotherapy with immune checkpoint inhibitors is expected to bring transformative changes to the management of recurrent or metastatic cervical cancer7.

Several clinicopathological features, including tumor size, International Federation of Gynecology and Obstetrics (FIGO) stage, lymph node metastasis, pathological type, and squamous cell carcinoma antigen (SCC-Ag) level, influence the progression of cervical cancer and serve as independent prognostic predictors8. Given the limitations of these features, imaging plays a crucial role in clinical diagnosis and treatment due to its convenience, non-invasiveness, speed, and cost-effectiveness. For advanced stages, CT is the recommended imaging modality9.

Radiomics is an innovative approach to image analysis that enables high-throughput extraction of quantitative features from standard medical images10, including MRI11, CT12, and PET/CT13. Over the past decade, advances in computational power and memory have facilitated the development and implementation of state-of-the-art artificial intelligence (AI) techniques for radiological image analysis. Encouragingly, radiomics has been widely applied in oncology, including pancreatic14 and colorectal cancers15. Deep learning (DL) techniques, which automatically learn representative information from raw image data to decode tumor radiological phenotypes, show great promise in disease diagnosis, prognosis assessment, and treatment response prediction16.

Recent studies have explored the predictive role of radiomics and DL in cervical cancer, including lymph node metastasis (LNM)17, treatment response18, recurrence19, and survival20. However, the accuracy and clinical applicability of survival prediction in aCC require improvement. Conventional radiomics workflows often rely on manual feature extraction from regions of interest, which can be operator-dependent, time-consuming, and may miss prognostically relevant sub-visual image patterns21,22,23. Therefore, there is a need to develop novel radiomic approaches to construct predictive models for clinical decision-making and prognosis assessment in aCC.

To address these limitations, we developed a novel deep learning-based prognostic model. Unlike previous approaches, our model employs a Vision Transformer (ViT) backbone to automatically extract comprehensive high-level spatial features from baseline CT scans, overcoming the subjectivity of hand-crafted features. Furthermore, we integrated a Recurrent Neural Network (RNN) to analyze contextual relationships across the depth dimension and fuse all information for holistic prediction. This deep learning-derived radiomic signature (Rad-score) was combined with key clinical variables in a multivariate Cox model to create a robust and clinically applicable nomogram. We hypothesize that this integrated approach will provide a more accurate and reliable tool for individualized survival prediction in patients with advanced cervical cancer, ultimately aiding clinical decision-making.

Materials and methods

Patients

This retrospective study included patients with aCC diagnosed by biopsy between July 1, 2014, and August 31, 2019, at Qilu Hospital of Shandong University. All patients met the following inclusion and exclusion criteria. Inclusion criteria: (1) biopsy-confirmed primary cervical cancer; (2) aCC (FIGO stage ≥ IIB) diagnosed by physical examination and ancillary examinations interpreted by experienced gynecologists; (3) pelvic contrast-enhanced CT scans performed within 2 weeks before treatment initiation; (4) age ≥ 18 years; (5) initial treatment with concurrent chemoradiotherapy (CCRT). Exclusion criteria: (1) insufficient clinical or follow-up data; (2) poor-quality CT images; (3) concurrent cervical diseases such as cervical myoma or cysts; (4) history of other malignancies; (5) incomplete standard treatment. Ultimately, 396 of 468 patients with suitable CT images were included. The study was approved by the Qilu Hospital of Shandong University Ethical Review Committee (KYLL-202203-030). As this was a retrospective study, the committee waived the requirement for informed consent, and all procedures complied with the Declaration of Helsinki and its subsequent amendments. Eligible patients were randomly divided into training (N1 = 198) and validation (N2 = 198) cohorts in a 1:1 ratio. The patient screening flowchart is shown in Fig. 1.

Fig. 1

The flow chart of patient screening in this study.

Clinical characteristics

We collected patient demographics, clinical features, and pathological results for retrospective analysis, including age at diagnosis, presence of chronic diseases (e.g., hypertension, diabetes, coronary heart disease), family history of cancer, tumor stage (FIGO 2018), tumor diameter, pathological type, and LNM. All data were obtained from medical records.

Acquisition and preprocessing of CT images

All patients underwent pelvic contrast-enhanced CT scans after diagnosis and before treatment. Scans were performed using a Siemens Somatom Definition AS multidetector CT scanner (Siemens Healthineers, Forchheim, Germany) with the following parameters: tube voltage, 120 kVp; tube current, automatic; reconstruction thickness, 5 mm; reconstruction interval, 5 mm; voxel size, 0.68 mm × 0.68 mm × 5 mm. Arterial-phase contrast-enhanced images were used. Ultravist 300 (Bayer, Germany) was administered as the contrast agent at a dose of 1.2 mL per kg of body weight and an injection rate of 3 mL/s. Scanning commenced 30 s after injection. Board-certified radiologists with 20 years of experience selected three representative images from each patient's CT scans: the slice with the largest tumor cross-sectional area, plus one slice proximal and one slice distal to it along the craniocaudal axis, to capture tumor heterogeneity. ImageJ software (http://imagej.nih.gov/ij, Java 1.8.0_172, 64-bit) was used to delineate a rectangular region of interest (ROI) encompassing the entire tumor (Supplementary Fig. 1). A total of 1188 tumor images (3 slices × 396 patients) were obtained for DL model training and validation.
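The slice-selection and ROI rules described above were applied manually by radiologists in ImageJ. Purely as an illustration of the same rules, the following minimal sketch assumes that a per-slice tumor area and a binary tumor mask are already available; both inputs, and the function names `select_slices` and `bounding_box`, are hypothetical and are not part of the study pipeline.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical 2D binary tumor mask for one axial slice


def select_slices(areas: Sequence[float]) -> List[int]:
    """Return indices of the largest-area slice and its two craniocaudal neighbours."""
    if len(areas) < 3:
        raise ValueError("need at least three slices")
    peak = max(range(len(areas)), key=lambda i: areas[i])
    # Assumption: if the peak lies at either end of the stack, shift inward
    # so that three adjacent slices are still returned.
    peak = min(max(peak, 1), len(areas) - 2)
    return [peak - 1, peak, peak + 1]


def bounding_box(mask: Mask, margin: int = 0) -> Tuple[int, int, int, int]:
    """Return (row_min, row_max, col_min, col_max) of the nonzero pixels.

    The box is padded by `margin` pixels and clipped to the image borders,
    mimicking a rectangular ROI drawn around the whole tumor.
    """
    rows = [r for r, line in enumerate(mask) if any(line)]
    cols = [c for line in mask for c, value in enumerate(line) if value]
    if not rows:
        raise ValueError("empty mask")
    height, width = len(mask), len(mask[0])
    return (
        max(min(rows) - margin, 0),
        min(max(rows) + margin, height - 1),
        max(min(cols) - margin, 0),
        min(max(cols) + margin, width - 1),
    )


# Three slices per patient over 396 patients gives the reported image count.
TOTAL_IMAGES = 3 * 396
```

A nonzero `margin` would include some peritumoral tissue inside the rectangle; the margin actually used by the radiologists is not stated in the text.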

Treatment and follow-up

Patients received standardized individualized treatment according to National Comprehensive Cancer Network (NCCN) guidelines. For aCC, primary treatment consisted of CCRT, including pelvic external beam radiotherapy (EBRT), cisplatin-based chemotherapy, and brachytherapy. Follow-up was conducted every 3 months for the first 2 years and biannually for the next 3–5 years. Assessments included radiotherapy and chemotherapy side effects, recurrence, metastasis, and survival status. The study endpoints were 3-year and 5-year overall survival (from diagnosis to event or censoring).

Deep learning

We developed a DL model based on ViT and RNN using pre-treatment CT images and survival data from the training cohort. The workflow is illustrated in Fig. 2. The model comprises a ViT for feature extraction and an RNN for sequence modeling. The ViT component includes a linear embedding layer, transformer encoder blocks, and a feature learning layer. ROIs from the selected CT images were cropped and resized to 384 × 384 pixels. The ViT served as a per-slice feature extractor, encoding each 2D slice into a feature vector rich in global information. The sequence of feature vectors was input into the RNN, which analyzed contextual relationships across the depth dimension and fused the information to generate a Rad-score indicating individual mortality risk. The model was constructed and validated using the randomly divided training and validation cohorts. Cross-entropy was used as the loss function for backpropagation, and hyperparameters were optimized via grid search. Key hyperparameters for ViT: patch size (16), hidden size (768), number of transformer layers (12), number of attention heads (12). For RNN: input size (768), hidden size (256), number of layers (2), dropout (0.1), bidirectional (True).
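The reported hyperparameters fix most of the tensor dimensions in this architecture. The following sketch only does that shape bookkeeping; it is not the authors' implementation, and the class and function names are hypothetical. Whether a class token or pooling step produces the per-slice feature vector is not stated in the text, so the sketch makes no assumption about it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    image_size: int = 384   # ROI resized to 384 x 384 pixels
    patch_size: int = 16
    hidden_size: int = 768
    num_layers: int = 12
    num_heads: int = 12


@dataclass(frozen=True)
class RNNConfig:
    input_size: int = 768
    hidden_size: int = 256
    num_layers: int = 2
    dropout: float = 0.1
    bidirectional: bool = True


def patches_per_slice(cfg: ViTConfig) -> int:
    """Number of non-overlapping patches the ViT sees in one slice."""
    if cfg.image_size % cfg.patch_size:
        raise ValueError("image size must be divisible by patch size")
    side = cfg.image_size // cfg.patch_size
    return side * side


def head_dim(cfg: ViTConfig) -> int:
    """Width of each attention head."""
    if cfg.hidden_size % cfg.num_heads:
        raise ValueError("hidden size must be divisible by the number of heads")
    return cfg.hidden_size // cfg.num_heads


def rnn_output_size(cfg: RNNConfig) -> int:
    """Per-step output width; a bidirectional RNN concatenates both directions."""
    return cfg.hidden_size * (2 if cfg.bidirectional else 1)


vit, rnn = ViTConfig(), RNNConfig()
SLICES_PER_PATIENT = 3  # sequence length fed to the RNN
```

With these settings each slice is split into a 24 × 24 grid of patches, each attention head is 64 wide, the ViT feature width matches the RNN input size (768), and the bidirectional RNN emits 512 values per slice before the final Rad-score layer.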

Fig. 2

The workflow of this study. For each patient, three tumor images are input into the DL model, in which the ViT learns features and the RNN integrates the feature representations and reports the final image score. The model outputs a Rad-score reflecting that patient's mortality risk. A novel combined model was then established, and the C-index, ROC curves, calibration curves, and clinical decision curves were used to evaluate model performance. Finally, a nomogram was created to visualize the model.

Statistical analysis

Analysis of the baseline data

Clinical characteristics of the training and validation cohorts were compared using the chi-squared test for categorical variables (age, FIGO stage, chronic complications, tumor size, histopathology, LNM, treatment, hemoglobin [HB], monocytes [M], and lactate dehydrogenase [LDH]) and, after normality testing, the Wilcoxon rank-sum test for continuous variables (median follow-up time). All hypothesis tests were two-sided, and P < 0.05 indicated a significant difference.

Analysis of the survival data

Kaplan-Meier survival analysis and the log-rank test were used to compare survival time between groups. The χ2 test was used to compare 3-year and 5-year survival rates between groups. All hypothesis tests were two-sided, and P < 0.05 was considered to indicate a significant difference.
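As an illustration of the χ2 comparison of survival rates, the sketch below computes a Pearson chi-squared statistic for a 2 × 2 table using only the standard library. It assumes no continuity correction, which may differ from the SPSS or R settings actually used; the function name `chi2_2x2` is hypothetical. The example table uses the 5-year death counts reported in the Results (72 of 198 in the training cohort, 62 of 198 in the validation cohort).

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    Returns (statistic, two-sided P value) with 1 degree of freedom.
    Assumption: no Yates continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: training, validation. Columns: died within 5 years, alive or censored.
statistic, p_value = chi2_2x2(72, 198 - 72, 62, 198 - 62)
```

For these counts the statistic is roughly 1.13 and the P value roughly 0.29, so this uncorrected test finds no significant difference in 5-year mortality between the two cohorts.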

Development and validation of the clinical prediction model

STROBE guidelines were followed in this retrospective observational study. Univariate and multivariate Cox proportional hazards regression analyses were used to determine the independent risk factors associated with survival in aCC. The results were expressed as hazard ratio (HR), 95% confidence interval (95%CI), and P value. Variables with P < 0.05 in univariate Cox regression analysis were entered into multivariate Cox regression analysis, and prediction models were built using forward, backward, and stepwise regression, respectively. Goodness of fit was then measured with the Akaike information criterion (AIC), and the method with the lowest AIC value was selected to determine the final prediction model.
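The AIC-based choice among the three selection procedures can be sketched as follows. The log partial likelihoods and coefficient counts in `example_fits` are hypothetical placeholders for illustration only and are not results from this study; the function names are likewise assumptions.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the fit with the lowest AIC.

    `fits` maps a procedure name to
    (maximized log partial likelihood, number of coefficients).
    """
    if not fits:
        raise ValueError("no candidate fits")
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical values for illustration only; not taken from the study.
example_fits = {
    "forward": (-350.2, 4),
    "backward": (-348.9, 5),
    "stepwise": (-349.0, 4),
}
best = select_by_aic(example_fits)
```

The penalty term 2k means that a model with one extra covariate must improve the log partial likelihood by more than 1 to be preferred.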

Based on the Cox regression analysis, the independent risk factors were used to construct three models: a clinical model including clinical and pathological characteristics, a radiomics model including the Rad-score, and an integrative model including clinical and pathological characteristics together with the Rad-score. The time-dependent C-index curve, receiver operating characteristic (ROC) curve, calibration curve, and decision curve analysis (DCA) were used to evaluate model performance.
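For readers unfamiliar with the concordance index, the following is a minimal sketch of Harrell's C-index for right-censored data. It is a simplified stand-in, not the time-dependent C-index curve computed in the study, and it ignores pairs with tied event times. The function name and inputs are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's concordance index.

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j. The pair is
    concordant when the model assigned patient i the higher risk; tied
    risks count as half.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("inputs must have equal length")
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.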

Software support

R software (version 4.2.2) and SPSS (R26.0.0.0) were used for all analyses. The DL model was implemented with Pyvips (version 1.12.3), OpenCV (version 4.1.1), Python (version 3.6.6), PyTorch (version 1.2.0), Scikit-learn (version 0.21.2), and Matplotlib (version 2.2.2).

Results

Characteristics of patients

Of 468 patients with pathologically confirmed aCC between July 2014 and August 2019, 396 patients (median age, 52; interquartile range [IQR], 46–60) were included in the study after applying the inclusion and exclusion criteria. They were randomly divided into a training cohort (N1 = 198) and a validation cohort (N2 = 198) in a 1:1 ratio. The characteristics of the two groups are presented in Table 1. Death within 5 years occurred in 72 of 198 women (36.4%) in the training cohort and in 62 of 198 women (31.3%) in the validation cohort. The median follow-up time was 51.4 months (49.9–55.8 months) in the training cohort and 50.5 months (48.8–52.9 months) in the validation cohort. Kaplan-Meier survival curves and cumulative risk curves for patients in each cohort are shown in Supplementary Fig. 2. There was no significant difference between the training and validation cohorts (P > 0.05).

Table 1 Characteristics of the patients at baseline.

Clinical predictors of survival

We performed Kaplan-Meier survival analysis of multiple variables, including clinical, pathological, and hematologic characteristics of patients with aCC in the training cohort, as shown in Supplementary Fig. 3–4. Age (P = 0.043), FIGO stage (P < 0.001), histological type (P = 0.018), and LNM (P < 0.001) were risk factors for survival in aCC. Patients aged > 65 years and those with more advanced stage, adenocarcinoma, or LNM had a poorer prognosis. Subsequent univariate and multivariate Cox regression analyses yielded consistent conclusions, as shown in Table 2.

Table 2 Univariate and multivariable analyses between clinical features and Rad-score in the training cohort.

Rad-score related to the survival of aCC patients

The workflow of the DL prediction model is shown in Fig. 2. In total, 1188 contrast-enhanced CT images (384 × 384 patches) covering the tumor area of 396 patients were obtained. A low Rad-score implies a relatively low risk of disease progression and therefore a longer survival time, whereas a high Rad-score implies a high risk of disease progression and a relatively short survival time. Youden's index was maximized at a Rad-score cutoff of 0.70. Consequently, patients with a Rad-score ≥ 0.70 were classified as the high-risk group, and those with a Rad-score < 0.70 as the low-risk group. We then performed Kaplan-Meier analysis to validate the prognostic value of the Rad-score. As shown in Supplementary Fig. 5, a higher Rad-score was significantly associated with poorer survival in both the training and validation cohorts, and survival differed significantly between the high-risk and low-risk groups (both P < 0.001). In addition, univariate Cox regression analysis was conducted for the Rad-score, and the results are shown in Table 2. The Rad-score was an independent predictor of survival in patients with aCC (HR = 5.29 [3.24–8.65], P < 0.001). Fig. 3 shows two aCC patients with similar clinical and pathological characteristics: the OS of the patient with a Rad-score of 0.98 was markedly shorter than that of the patient with a Rad-score of 0.42. Radiographic features associated with the Rad-score are shown in Supplementary Fig. 6.
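The cutoff selection by Youden's index can be sketched as below. The text does not state which binary outcome was used to compute sensitivity and specificity, so the sketch simply assumes a death indicator per patient; the function name and the toy data in the usage lines are hypothetical.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float], died: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A patient is predicted high-risk when score >= cutoff, matching the
    ">= 0.70 is high-risk" rule used in the text.
    """
    positives = sum(1 for y in died if y)
    negatives = len(died) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, died) if y and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, died) if not y and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def risk_group(rad_score: float, cutoff: float = 0.70) -> str:
    """Dichotomize a Rad-score with the cutoff reported in the study."""
    return "high" if rad_score >= cutoff else "low"


# Toy data for illustration only; not patient data.
toy_cutoff, toy_j = youden_cutoff([0.1, 0.3, 0.5, 0.7, 0.8, 0.9], [0, 0, 0, 1, 1, 1])
```

Applied to the two patients in Fig. 3, `risk_group(0.98)` gives "high" and `risk_group(0.42)` gives "low".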

Fig. 3

The Rad-score and survival outcomes of two patients with similar clinical and pathological characteristics.

Development of the survival prediction model of aCC

Schoenfeld individual tests (Supplementary Fig. 7a) indicated that the Cox model satisfied the proportional hazards assumption (P > 0.05). As shown in Supplementary Fig. 7b, there was no strong correlation between the clinical and pathological variables and the Rad-score. We then carried out multivariate Cox regression analysis on the clinical and pathological variables and the Rad-score, and determined that age (P < 0.001), FIGO stage (P < 0.05), histological type (P = 0.029), LNM (P = 0.002), and Rad-score (P < 0.001) could be used together to predict the survival of patients with aCC. The results of the multivariate Cox regression analysis are shown in Table 2 and Fig. 4.

Fig. 4

The forest plot of the multivariate Cox regression analysis of clinical and pathological characteristics and Rad-score.

As shown in Table 3, in the training cohort, the clinical model showed acceptable survival prediction performance (C-index: 0.686, 95%CI: 0.625–0.747). In the validation cohort, the performance of the clinical model was further validated (C-index: 0.632, 95%CI: 0.569–0.695). The C-index of the Rad-score model based on radiomics and DL was 0.723 (95%CI: 0.658–0.788) in the validation cohort, which was significantly better than that of the clinical model (P < 0.001). To further evaluate whether the Rad-score could improve the performance of the clinical model, we constructed an integrative model for survival prediction using clinical and pathological characteristics and the Rad-score. The C-index of the integrative model was significantly higher than that of the clinical model and the radiomics model in both the training cohort (0.784, 95%CI: 0.733–0.835) and the validation cohort (0.726, 95%CI: 0.677–0.785) (P < 0.001).

Table 3 C-index of clinical, radiomics and integrative model in training and validation cohorts.

Validation of the survival prediction model of aCC

Supplementary Fig. 8a shows that the C-index of the integrative model did not change significantly with survival time, indicating stable performance. The ROC curves showed that the integrative model (Supplementary Fig. 8b-c) achieved better sensitivity and specificity than the clinical model and the radiomics model (Supplementary Fig. 9). In addition, the calibration curves (Supplementary Fig. 8d-e) and the DCA curves (Supplementary Fig. 8f-g) of the integrative model supported its calibration and clinical utility. Together, these results indicate that the CT-based Rad-score improves the predictive performance of the clinical survival model for patients with aCC, and that the integrative model based on clinical and pathological characteristics and the Rad-score is promising for predicting patient survival.

Visualization and risk stratification of the survival prediction model of aCC

A nomogram was drawn to visualize the integrative prediction model (Fig. 5a). In clinical practice, clinicians can enter the patient's age at diagnosis (≤ 65 years, > 65 years), FIGO stage (stage IIB, III, IV), histological type (squamous cell carcinoma, adenocarcinoma), LNM (positive, negative), and Rad-score to obtain the predicted 3-year and 5-year survival probabilities.
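A nomogram built on a Cox model is a graphical form of the relation S(t | x) = S0(t)^exp(β·x), where S0(t) is the baseline survival at time t and β·x is the linear predictor. The sketch below shows that calculation; the baseline survival, coefficient, and covariate coding used in the usage lines are hypothetical illustrations and are not the fitted values of the integrative model.

```python
import math
from typing import Dict


def survival_probability(
    baseline_survival: float,
    coefficients: Dict[str, float],
    covariates: Dict[str, float],
) -> float:
    """Predicted survival at a fixed time point under a Cox model.

    Computes S0(t) ** exp(sum of beta_k * x_k), where `baseline_survival`
    is S0(t) for a patient whose covariates are all zero.
    """
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline survival must lie in (0, 1]")
    linear_predictor = sum(
        beta * covariates[name] for name, beta in coefficients.items()
    )
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical numbers: a single covariate with hazard ratio 2 (beta = ln 2)
# and a baseline survival of 0.80 at the chosen time point.
example = survival_probability(0.80, {"x": math.log(2)}, {"x": 1.0})
```

In this toy case a hazard ratio of 2 squares the baseline survival, giving about 0.64; the corresponding predicted probability of death is one minus that value.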

Fig. 5

Visualization and risk stratification of the survival prediction model for aCC patients. (a) The integrative Nomogram prediction model. (b-c) The distribution of patients and the incidence of death events in 3 and 5 years with different risk stratification in the validation cohort. (d-e) The Kaplan-Meier analysis in 3 and 5 years with different risk stratification in the validation cohort.

We then classified all patients into low-, intermediate-, and high-risk strata on the basis of the lower and upper quartiles of the probability of death at 3 and 5 years predicted by the integrative model in the training cohort. The 3-year mortality risk thresholds were 0.124 and 0.346, and the 5-year mortality risk thresholds were 0.184 and 0.4787. To validate the clinical value of risk stratification, the 3-year and 5-year distributions of the risk populations and the rates of death events were described and compared in the validation cohort (Fig. 5b-c). For patients with aCC, the 3-year mortality rates of the high-risk, intermediate-risk, and low-risk groups were 55.4%, 25.3%, and 10.4%, respectively (P < 0.001). The corresponding 5-year mortality rates were 57.1%, 28.6%, and 6%, respectively (P < 0.001).
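The three-level stratification can be expressed as a simple threshold rule using the cut-offs reported above. How patients falling exactly on a threshold were assigned is not stated, so the sketch assumes that boundary values go to the intermediate group; the function name is hypothetical.

```python
from typing import Dict, Tuple

# Reported cut-offs: (lower quartile, upper quartile) of the predicted
# probability of death in the training cohort, keyed by horizon in years.
THRESHOLDS: Dict[int, Tuple[float, float]] = {
    3: (0.124, 0.346),
    5: (0.184, 0.4787),
}


def risk_stratum(death_probability: float, horizon_years: int) -> str:
    """Map a predicted death probability to low, intermediate, or high risk.

    Assumption: values exactly equal to a threshold are treated as
    intermediate risk.
    """
    if not 0.0 <= death_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    lower, upper = THRESHOLDS[horizon_years]
    if death_probability < lower:
        return "low"
    if death_probability > upper:
        return "high"
    return "intermediate"
```

For example, a predicted death probability of 0.40 falls in the high-risk group at 3 years but in the intermediate-risk group at 5 years, because the 5-year upper threshold is higher.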

In addition, Kaplan-Meier survival curves were constructed according to risk stratification for overall survival at 3 years and 5 years (Fig. 5d-e). In the validation cohort, compared with the low-risk group, the hazard ratios (HRs) for death within 3 years in the intermediate-risk and high-risk groups were 2.703 (95%CI, 1.203–6.073; P = 0.016) and 6.470 (95%CI, 2.976–14.067; P < 0.001), respectively. The HRs for death within 5 years were 5.407 (95%CI, 1.636–17.866; P = 0.006) and 13.680 (95%CI, 4.189–44.676; P < 0.001), respectively.

Discussion

This study describes the survival status of aCC patients, with 3-year and 5-year mortality rates of 29.0% and 33.6%, respectively. Cox regression identified age, FIGO stage, histological type, and LNM as independent risk factors. We developed a DL model combining ViT and RNN to extract radiomic features from CT images, which served as a novel marker to improve survival prediction in aCC. The integrative model, incorporating clinical, pathological characteristics and Rad-score, demonstrated high accuracy and discriminative ability, with C-indexes of 0.784 and 0.726 in the training and validation cohorts, respectively24. The combination of radiomics and DL algorithms enables automatic extraction of complex features from CT images, quantifies disease risk, and facilitates timely identification of high-risk patients. The integrative model and risk stratification have broad clinical applicability, supporting individualized treatment and medical decision-making for aCC patients. Gynecologists can adjust treatment plans under personalized guidance, intensify therapy, or explore new approaches such as immunotherapy and targeted therapy.

Radiomics, which converts medical images into mineable high-throughput quantitative features, has garnered increasing attention25. While biopsies capture heterogeneity only within a small tumor region, radiomics assesses heterogeneity across the entire tumor volume. DL algorithms further enhance this by automatically learning complex features and providing quantitative evaluations26. Importantly, imaging features derived from DL correlate significantly with clinicopathological characteristics27. Traditional CNNs, though widely used in medical image analysis, focus primarily on local features and may overlook global context. Peritumoral imaging features reflect biological characteristics related to intra-tumoral heterogeneity28; thus, excluding peritumoral regions may result in information loss. In this study, experienced radiologists delineated rectangular ROIs encompassing the entire tumor and surrounding areas. ViT, with its self-attention mechanism, excels at capturing global information through cross-patch interactions29. Our team has previously demonstrated the utility of ViT in ovarian cancer. This study is the first to apply ViT and RNN to predict survival in aCC patients, focusing on global information aggregation. We extracted Rad-scores from pre-treatment contrast-enhanced CT images and combined them with clinical features to construct a nomogram, improving prediction accuracy.

Several limitations should be considered. First, the retrospective design may introduce selection bias despite the relatively large sample size. Second, the single-center study and use of CT images from a single manufacturer limit generalizability. Multi-center external validation is needed before clinical integration. Additionally, manual image selection and segmentation are time-consuming, highlighting the need for AI-based automatic lesion detection and segmentation to enable automated prediction post-CT acquisition.

Future work will involve prospective multi-center validation with larger samples to strengthen clinical evidence. Incorporating multimodal biological predictors, including pathomic and genomic data, may further enhance the predictive power of the integrated model.