Introduction

Patients with locally recurrent rectal cancer (LRRC; relapse rate 2.4–10%) still face a poor prognosis and have limited curative therapeutic options1,2,3. There is no standard treatment strategy for LRRC patients who cannot tolerate re-operation or re-irradiation, so improving their prognosis remains a pressing clinical challenge. CT-guided radioactive 125I seed implantation (RISI) has been recommended by the National Comprehensive Cancer Network (NCCN) guidelines as a safe salvage strategy4. Because the therapeutic response to RISI varies considerably between individuals, personalized and accurate prediction of overall survival (OS) and local control (LC), together with risk stratification, could improve clinical management during follow-up and guide treatment strategies for LRRC patients. Our previous studies reported several conventional dosimetric risk parameters associated with LC after RISI in LRRC patients, based on univariate analysis5,6; however, these parameters do not adequately capture the complex tumor biology and are of limited use for precise prognosis. Identifying reliable imaging biomarkers and constructing robust prognostic tools therefore remains a critical barrier to personalized management of LRRC patients treated with CT-guided RISI, and no such studies have been reported to date.

CT imaging with high spatial resolution is widely used to detect pelvic recurrence and locate the tumor. Radiomics and machine learning methods move beyond the traditional framework of medical image analysis by extracting and analyzing many quantitative texture features from radiologic images to characterize the intrinsic heterogeneity of the lesion and the underlying molecular regulation7,8,9. Emerging evidence highlights the growing significance of the tumor microenvironment, suggesting that peritumoral radiomics features (RFs) are also promising prognostic biomarkers10,11. Moreover, RFs extracted automatically by pre-trained convolutional neural networks (CNNs) have shown growing success in monitoring therapeutic effect and evaluating prognosis in many malignancies12,13,14. However, the prognostic potential of CT-derived handcrafted and deep learning (DL) features extracted from intra- and peri-tumoral regions for predicting LC and OS of LRRC patients after RISI remains unexplored. Our aim was to investigate whether intra- and peri-tumoral radiomics combined with clinical variables can improve personalized predictions.

The random survival forest (RSF) and Cox proportional hazards regression (CHR) models have demonstrated robust predictive performance in a variety of survival analyses15,16,17. However, it is not yet known whether a machine learning model or a traditional statistical method is the better predictive model for LRRC patients. To address these gaps, this study is the first to combine multiscale CT-based radiomics to develop and rigorously compare two prognostic models tailored to LRRC patients treated with RISI, with the goal of optimizing surveillance and adjuvant therapy strategies.

Materials and methods

Ethics and patients

This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The protocol for this retrospective study was approved by our hospital’s Institutional Review Committee (No. IRB00006761), and the requirement for informed consent was waived. In addition, the whole process adhered to the METRICS checklist to improve the credibility, reproducibility, and transparency of the study; the completed checklist is provided in Supplementary 1.

Initially, a total of 249 patients diagnosed with LRRC and treated with RISI were recruited, of whom 60 were excluded because of missing CT images or incomplete clinical information. The detailed inclusion and exclusion criteria are shown in Fig. 1. All LRRC patients had received external beam radiotherapy, chemotherapy or surgery before RISI treatment. Complete medical records of all LRRC patients were reviewed to collect the dosimetric characteristics, whose definitions are fully described in Supplementary 2. Finally, 189 eligible LRRC patients treated at Peking University Third Hospital between December 2015 and December 2023 were identified through standardized screening protocols. There were no additional missing data among the 189 analyzed patients. Of these, 145 patients underwent RISI assisted by 3D-printed non-coplanar template (3D-PNCT) technology, and the remaining 44 patients underwent traditional RISI. The workflow of 3D-PNCT-assisted RISI strictly followed the procedure we described previously6, as depicted in Fig. S1. The patients were randomly allocated to training and validation sets at a ratio of 7:3. Bootstrap resampling with 1000 repetitions was applied to the training set (N = 132) for development and comparison of the RSF and CHR models, and independent internal validation was conducted on the validation set (N = 57).
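The allocation described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the use of integer patient IDs, and the seed value are assumptions made only for illustration. It shows that a 7:3 random split of 189 patients gives 132 training and 57 validation cases, and how bootstrap index sets could be drawn from the training set.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=42):
    """Shuffle patient IDs and split them into training and validation lists.

    The seed is an illustrative assumption; the paper does not report one
    for the cohort split.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

def bootstrap_indices(n, n_repeats, seed=42):
    """Yield n_repeats index lists of length n, each sampled with replacement."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield [rng.randrange(n) for _ in range(n)]

train_ids, valid_ids = split_patients(range(189))
print(len(train_ids), len(valid_ids))  # 132 57

# Each bootstrap replicate would be used to refit a model on resampled training patients.
first_replicate = next(bootstrap_indices(len(train_ids), n_repeats=1000))
```

Keeping the validation patients out of every bootstrap replicate is what makes the later validation-set metrics an internal hold-out estimate rather than a resubstitution estimate.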

Fig. 1

The detailed exclusion and inclusion criteria.

Endpoints and follow-up

We evaluated each patient’s disease status after RISI using routine blood work, biochemical testing, tumor marker analysis, abdominal CT, chest CT, and pelvic MRI. Treatment outcomes were evaluated according to the RECIST guideline (version 1.1)18. LC and OS were the target variables in the current study. Progression was defined as an increase of 20% or more in the sum of diameters of tumor lesions on available follow-up imaging. LC was defined as the time from RISI completion to tumor progression within the treated lesion. OS was calculated from the initiation of RISI to the date of death from any cause or the last follow-up. Patients were assessed for LC and OS every 3 months after RISI. Patients without local progression were right-censored at the last follow-up.
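To make the handling of right-censored follow-up concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The function name and the toy follow-up times are hypothetical and are not taken from the study data; the study itself performed its survival analyses in R.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up durations (e.g. months since RISI)
    events : 1 if the event (local progression or death) was observed,
             0 if the patient was right-censored at that time
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without lowering the curve.
        n_at_risk -= n_leaving
    return curve

# Hypothetical toy data: five patients, two of them censored.
for t, s in kaplan_meier([3, 6, 6, 9, 12], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Censored patients contribute to the risk set up to their last follow-up and are then removed, which is how a patient without local progression still informs the LC estimate.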

Image acquisition

All patients underwent pre-treatment helical pelvic CT (Brilliance BigBore, Philips, Amsterdam, Netherlands) 2 days before RISI to locate the tumor, with the following standardized imaging protocol: tube voltage, 120 kV; tube current, 325 mA; collimation, 16 × 1.5 mm; beam pitch, 0.938; field of view (FOV), 500 mm; reconstruction slice thickness, 5 mm; rotation time, 0.75 s; and matrix size, 512 × 512.

ROIs segmentation

The recurrent tumors (intratumoral region) were manually contoured slice by slice in the Brachytherapy Treatment Planning System (B-TPS) by a physician (reader 1) with 10 years of clinical diagnostic experience. All delineations were rigorously reviewed and refined by a senior radiologist (reader 2, > 15 years of experience) to ensure segmentation accuracy before the radiotherapy plan was designed and optimized. Readers 1 and 2 independently contoured target volumes for 50 randomly selected cases to evaluate inter-observer consistency. In addition, reader 1 repeated the contouring for the same 50 patients after a one-week interval to assess intra-observer reproducibility.

To determine the optimal peritumoral range, we used morphological dilation via Python’s SimpleITK library to isotropically expand the gross tumor volume (GTV) by 1–6 mm, generating peritumoral regions of interest (ROIs) for comparative analysis. This process yielded seven distinct sets of ROIs per patient: the original GTV and six peritumoral ROIs at 1 mm intervals. Because recurrent tumors frequently involve the sacrum and rectum, we excluded air and bone from the contours by removing voxels with Hounsfield unit (HU) values below −200 or above +400, which refined the peritumoral expansion. The study process flow is depicted in Fig. 2.
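The expansion and HU filtering can be sketched as follows. This is a simplified 2D illustration in plain Python, not the SimpleITK-based 3D pipeline used in the study. It assumes 1 mm isotropic voxels (so a margin in mm equals a radius in voxels) and assumes the peritumoral ROI is the ring obtained by subtracting the GTV from the dilated mask; the text does not state whether the ROI is a ring or the full expanded volume. All function names and the toy grid are hypothetical.

```python
def dilate(mask, radius):
    """Binary dilation of a 2D mask with a disk of the given radius (in voxels)."""
    rows, cols = len(mask), len(mask[0])
    r = int(radius)
    offsets = [(dy, dx)
               for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)
               if dy * dy + dx * dx <= radius * radius]
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if mask[y][x]:
                for dy, dx in offsets:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        out[yy][xx] = 1
    return out

def peritumoral_ring(gtv, hu, margin, hu_min=-200, hu_max=400):
    """Ring around the GTV, excluding voxels outside [hu_min, hu_max] (air, bone)."""
    dilated = dilate(gtv, margin)
    return [[1 if dilated[y][x] and not gtv[y][x] and hu_min <= hu[y][x] <= hu_max else 0
             for x in range(len(gtv[0]))]
            for y in range(len(gtv))]

# Toy 5 x 5 slice: one tumor voxel in the centre, soft tissue (40 HU) around it,
# with one bone-like voxel (700 HU) and one air-like voxel (-1000 HU) as neighbours.
gtv = [[0] * 5 for _ in range(5)]
gtv[2][2] = 1
hu = [[40] * 5 for _ in range(5)]
hu[2][3] = 700
hu[1][2] = -1000

ring = peritumoral_ring(gtv, hu, margin=1)
print(sum(map(sum, ring)))  # 2 of the 4 neighbouring voxels survive the HU filter
```

Repeating the call with margins 1 to 6 would give the six peritumoral ROIs that, together with the GTV, make up the seven ROI sets per patient.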

Fig. 2

The study process flow. RFs: radiomics features; DL: deep learning.

Image processing and features extraction

Before feature extraction, the CT images of each patient were resampled and normalized. Because in-plane resolution varied, the CT voxels were resampled to 1 × 1 × 1 mm3 using linear interpolation. Subsequently, the gray values were normalized to [0, 1] using min-max normalization. The resulting images were used for handcrafted and deep learning feature extraction.
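As a small worked example of the min-max step, the sketch below rescales a list of gray values to [0, 1]. The function name and the sample values are illustrative assumptions; the handling of a constant image (all values mapped to 0) is also an assumption, since the text does not discuss that case.

```python
def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid division by zero (illustrative choice).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([-200, 100, 400]))  # [0.0, 0.5, 1.0]
```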

Intratumoral and peritumoral radiomics features

The detailed settings for CT image preprocessing and handcrafted RF extraction are provided in Table S1. For each patient, a total of 1874 quantitative intra- and peri-tumoral handcrafted RFs were initially derived from the original CT images and 7 image preprocessing filters using the open-source PyRadiomics package. The whole feature extraction process adhered to the Image Biomarker Standardization Initiative (IBSI) protocols19,20.

Deep learning features

We used ResNet and DenseNet as feature extractors, fine-tuned via transfer learning using the TensorFlow framework (v2.13.0) on an NVIDIA GeForce GT 730 graphics processing unit. The architectures of the ResNet-50 (DL1), ResNet-101 (DL2), DenseNet121 (DL3) and DenseNet201 (DL4) models are presented in Fig. S2. These four commonly used CNNs were pre-trained on the large-scale, well-annotated ImageNet database21. Seven consecutive two-dimensional axial CT slices, centered on the slice with the largest lesion area, were selected and resized to 224 × 224 pixels as the network input. Owing to the limited amount of data, we performed data augmentation on each training image, including random translation and rotation. We trained the networks by dividing LRRC patients into two classes, with the median LC and OS times as the cut-off values, and used sigmoid cross-entropy as the loss function on the training set (n = 132), while performance metrics on an independent validation set were monitored to trigger early stopping at 200 epochs. We chose the Adam optimization algorithm with an initial learning rate of 0.0003, a batch size of 32 and a random state of 42 for the DL models. For each patient, 2048 features each for DL1 and DL2, 1024 for DL3 and 1920 for DL4 were extracted from the global average pooling layer. Guided Gradient-weighted Class Activation Mapping (Guided Grad-CAM) was used to visualize the output of the last convolutional layer and aid interpretation of the four DL models, as shown in Fig. S3.
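The labelling and loss described above can be illustrated without a deep learning framework. The sketch below is a plain-Python illustration, not the TensorFlow training code. It assumes that a patient whose time exceeds the median receives label 1 and that censoring is ignored when forming labels; the text does not specify either point. The function names and toy values are hypothetical.

```python
import math
from statistics import median

def median_labels(times):
    """Binary labels from a median split: 1 if time > median, else 0 (assumed convention)."""
    cutoff = median(times)
    return [1 if t > cutoff else 0 for t in times]

def sigmoid_cross_entropy(logits, labels):
    """Mean binary cross-entropy computed directly from logits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)),
    which equals -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))].
    """
    losses = [max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
              for z, y in zip(logits, labels)]
    return sum(losses) / len(losses)

labels = median_labels([6, 12, 15, 20, 30])   # toy follow-up times in months
print(labels)                                  # [0, 0, 0, 1, 1]
print(sigmoid_cross_entropy([-2.0, -1.0, 0.0, 1.5, 3.0], labels))
```

A confident correct logit contributes a loss near zero, while a logit of 0 contributes log 2, which is why the loss falls as the network separates the two classes.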

Feature selection

Z-score normalization was applied to bring feature values of different magnitudes onto a common scale before the following steps. The repeatability and robustness of handcrafted and deep learning RFs were quantitatively evaluated using the intraclass correlation coefficient (ICC). Features with an ICC above 0.80 were considered stable and entered the subsequent feature selection. Because of the high dimensionality of the features, univariate Cox regression analysis was first performed to screen RFs potentially related to prognosis (p < 0.05). Subsequently, Pearson correlation analysis was used to remove highly correlated features: only one feature from each pair with a coefficient above 0.9 was retained. To further avoid overfitting and enhance model generalizability, LASSO-Cox regression with 5-fold cross-validation was then applied, which eliminates redundant RFs by shrinking the coefficients of non-predictive variables to zero; a maximum of 10 features were retained in each radiomic model. The corresponding radscore (RS) models for predicting LC and OS were built from the most predictive handcrafted and deep learning RFs with nonzero coefficients, derived from the intratumoral ROI, the 6 peritumoral ROIs, and the 4 DL models.
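Two of the steps above, z-score scaling and the Pearson redundancy filter, can be sketched in plain Python. This is an illustration only. The rule for which member of a correlated pair to keep (here, the feature encountered first) is an assumption, as the text does not state it, and the feature names and values are made up.

```python
from statistics import mean, pstdev

def z_score(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| <= threshold with every feature already kept.

    features: dict mapping feature name -> list of values across patients.
    Keeping the earlier feature of a correlated pair is an illustrative choice.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy features for five patients.
features = {
    "f1": [1, 2, 3, 4, 5],
    "f2": [2, 4, 6, 8, 10.5],   # nearly a multiple of f1, so it is redundant
    "f3": [5, 1, 4, 2, 3],
}
print(drop_correlated(features))  # ['f1', 'f3']
```

In the pipeline described in the text, the surviving features would then be passed to LASSO-Cox regression, which is not reproduced here.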

Model construction and statistical analysis

Clinical and dosimetric variables significantly associated with LC and OS (p < 0.05) were screened out by univariate and multivariate Cox regression analysis. These selected characteristics were then used to construct the RSF and CHR prediction models. First, the tune_grid method was used to identify the optimal combination of parameters for the RSF model. Next, we ran the RSF and CHR models with 1000 bootstrap resampling iterations on the training set to increase robustness. Finally, we evaluated the prognostic performance of the RSF and CHR models in the validation set using the concordance index (C-index), integrated Brier score (IBS) and time-dependent area under the curve (tAUC) to select the better prognostic model. The 95% confidence intervals for each metric were obtained by bootstrap resampling. The R packages used (R version 4.3.1) are described in Supplementary 3. A two-sided p value below 0.05 was considered statistically significant.
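For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration under stated assumptions rather than the R implementation used in the study: a higher risk score is taken to mean a shorter expected time to event, tied risk scores count as half-concordant, and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the patient with the earlier event has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: four patients, the third one censored.
c = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.4, 0.6, 0.1],
)
print(c)  # 0.8 (4 concordant pairs out of 5 comparable pairs)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking, which gives a scale for reading the C-indices reported in the Results.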

Ethics approval

The study protocol was approved by the Ethics Committee of Peking University Third Hospital [IRB00006761].

Results

Patient characteristics

Baseline clinical characteristics and dosimetric parameters of LRRC patients in the training and validation sets are listed in Table 1. The distributions of these variables did not differ significantly between the two cohorts (all p > 0.05). The median (95%CI) LC and OS times for the 189 patients were 15.0 (12.5–18.0) months and 19.4 (17.6–21.3) months, respectively. The one- and two-year LC rates were 58.3% (95%CI 51.5–66.2) and 29.5% (95%CI 21.8–39.1), respectively. The corresponding one- and two-year OS rates were 75.6% (95%CI 69.7–82.0) and 37.5% (95%CI 35.2–52.8), respectively. During the follow-up period, a total of 108 (57.1%) LRRC patients had confirmed progressive disease in the treated region and 134 (70.9%) patients had died by the end of follow-up. Death prior to local progression constitutes a competing risk that may bias LC estimates. In our study (n = 189), no deaths occurred without prior documented local progression, which removes the concern about competing risks for the LC endpoint.

Table 1 Patient characteristics in the training and validation sets.

Result of feature selection

Feature selection was performed on the training data as a sequential combination of ICC screening, univariate analysis, and LASSO-Cox selection, as shown in Fig. 3A. After excluding features that did not reach an intra- and inter-observer ICC > 0.80, 1245 handcrafted, 1089 DL1, 1072 DL2, 978 DL3 and 1062 DL4 RFs with high reproducibility were retained for subsequent analysis. Details of the features selected by univariate Cox regression analysis and the LASSO-Cox model are shown in Supplementary 4. Finally, the 10 most useful intratumoral handcrafted RFs were retained for LC prediction, and 9 features were used for OS prediction. The numbers of peritumoral handcrafted RFs and DL features associated with LC and OS are illustrated in Fig. 3B.

Fig. 3

The results of feature selection. A Methodology of feature selection. B Results of feature selection: the number of selected features in feature selection pipeline.

Performance of radiomics signatures

The performance of all radscores for predicting LC and OS was evaluated in both the training and validation cohorts, as detailed in Table 2. Notably, RSHad attained the highest C-index for LC prediction (0.72, 95%CI 0.63–0.81) in the validation set. Among the peritumoral radscores, the one generated from the 1-mm peritumoral expansion (RSPeri1mm) gave the best LC prediction, with a C-index of 0.70 (95%CI 0.60–0.77) in the validation set. For the prespecified secondary endpoint OS, the radscore based on the 4-mm expansion (RSPeri4mm) showed the highest C-index, 0.64 (95%CI 0.53–0.74). The differences between peritumoral radscores were confirmed by the DeLong test (all p < 0.05). Therefore, RSPeri1mm and RSPeri4mm were retained as the representative peritumoral radscores for LC and OS prediction, respectively. Additionally, in the validation set RSDL4 achieved a C-index of 0.67 (95%CI 0.57–0.77) for LC, and RSDL2 achieved 0.63 (95%CI 0.54–0.70) for OS. Kaplan-Meier curves based on RSHad, RSPeri1mm and RSDL4 for LC (Fig. 4a, b, c, p < 0.05) and on RSHad, RSPeri4mm and RSDL2 for OS (Fig. 4a1, b1, c1, p < 0.05) showed significant risk stratification of patients in the validation set. All the radscore models showed better predictive performance than the clinical model in predicting LC. The features in RSHad and RSDL are presented in Table S2. The handcrafted radiomic features in RSPeri1mm and RSPeri4mm selected by LASSO-Cox analysis, along with their corresponding coefficients, are listed in Table S3.

Table 2 Predictive performance (C-index) of intratumoral, different peritumoral, and deep learning radscores for LC and OS prediction of the LRRC patients in the training and validation sets. Peri peritumoral, RS radscore, C-index concordance index, CI confidence interval.
Fig. 4

LC analysis (a–c) stratified by risk classification according to RSHad, RSPeri1mm, RSDL4, and OS analysis (a1–c1) based on the RSHad, RSPeri4mm, RSDL2 for patients in the validation cohort. The log-rank test was used to statistically compare the LC and OS distributions depicted in the Kaplan-Meier curves.

Clinical prognosis factors

Multivariate Cox regression in the training set indicated that 3D-PNCT (HR = 0.47, 95%CI 0.26–0.87, p = 0.016), D90 (HR = 0.99, 95%CI 0.99–1.00, p = 0.001) and V100 (HR = 0.94, 95%CI 0.90–0.97, p = 0.001) were significantly associated with LC. DM (HR = 1.84, 95%CI 1.27–2.67, p = 0.001), T stage (HR = 0.36, 95%CI 0.21–0.60, p < 0.001), chemotherapy (HR = 0.04, 95%CI 0.01–0.19, p = 0.033), radiotherapy (HR = 2.05, 95%CI 1.11–3.76, p = 0.020), D90 (HR = 1.00, 95%CI 1.00–1.00, p < 0.001) and V100 (HR = 0.95, 95%CI 0.91–0.99, p = 0.016) were prognostic predictors associated with survival outcomes (Table S4). The Kaplan–Meier analyses of these independent clinical and dosimetric factors for predicting LC and OS in LRRC patients are shown in Fig. S4 and Fig. S5, respectively (all p < 0.05). The clinical model had C-indices of 0.66 and 0.67 for predicting LC and OS in the validation set, respectively.

Performance and risk stratification of the RSF and CHR model

Figure S6 shows the pairwise Spearman correlation matrices between the independent clinical factors and the selected radscores for predicting LC (A) and OS (B) (all p values > 0.05; correlation range -0.38 to 0.74 for LC and -0.28 to 0.74 for OS). This suggests that the correlation-based feature selection procedure mitigated feature redundancy. We then integrated identical risk predictors into the two models to explore their predictive capability for LC and OS. The RSF model was run 1000 times with n_tree = 1000 and nodesize = 4 to predict LC, and with n_tree = 1500 and nodesize = 14 to predict OS. Table 3 shows the IBS and C-index, with 95%CIs, of the RSF and CHR models in the training and validation sets. In the validation set, the traditional CHR model (IBS: 0.16, 95%CI 0.15–0.17; C-index: 0.72, 95%CI 0.63–0.81) performed significantly worse in predicting LC than the RSF model (IBS: 0.13, 95%CI 0.12–0.13, p < 0.001; C-index: 0.76, 95%CI 0.64–0.84, p < 0.01). Similarly, the RSF model (IBS: 0.11, 95%CI 0.10–0.12; C-index: 0.75, 95%CI 0.75–0.77) exhibited superior predictive accuracy and greater generalization capability relative to the CHR model (IBS: 0.17, 95%CI 0.13–0.20; C-index: 0.69, 95%CI 0.60–0.77) for OS prediction.

Table 3 The prognostic performance of RSF and CHR models integrating the same selected radscores and clinical features in predicting LC and OS for LRRC patients after RISI in the training and validation sets. 1Comparison of the performance of the CHR model with that of the RSF model in the same dataset. Abbreviations: CHR, Cox hazards regression; RSF, random survival forests; CI, confidence interval.

Furthermore, LRRC patients could be stratified into low- and high-risk groups by the median predicted values of the RSF and CHR models (LC threshold: 19.15; OS threshold: 51.80) in the training and validation sets. For LC, the RSF predicted values gave significant stratification (training: p < 0.001; validation: p < 0.001, Fig. 5a, b), whereas the CHR model was significant only in the training set (training: p = 0.009; validation: p = 0.18, Fig. 5c, d). For OS in the validation set, the RSF model (p = 0.026, Fig. 5f) provided better prognostic stratification than the CHR model (p = 0.32, Fig. 5h). The high-risk patients displayed significantly worse LC and OS than those in the low-risk group. Compared with the single radscore models for LC, risk stratification based on the RSF predicted values reached greater statistical significance. Figure 6A and B show the stratification of LRRC patients by the RSF model’s median predicted value for LC. The high-risk group exhibited significantly higher progression rates than the low-risk group at 1 year (76.1% vs. 1.5%) and 2 years (100.0% vs. 35.9%) (p < 0.001). Patients without progression at 1 and 2 years had markedly lower predicted scores than those with tumor progression, with mean differences of 26.05 (95%CI 25.68–28.81; p < 0.001) and 27.76 (95%CI 23.71–32.89; p < 0.001), respectively. Additionally, LRRC patients who survived 1 year (mean RSF predicted value: 44.73 vs. 93.29, p < 0.001, Fig. 6C) and 2 years (mean RSF predicted value: 33.98 vs. 72.75, p < 0.001, Fig. 6D) had significantly lower predicted values than those who died.
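The median-based stratification can be expressed as a short Python sketch. The function name and the example predicted values are hypothetical. Treating values strictly above the cut-off as high risk is an assumption, and the LC threshold of 19.15 is taken from the text.

```python
from statistics import median

def stratify_by_threshold(predicted_values, threshold=None):
    """Label each patient 'high' or 'low' risk.

    If no threshold is supplied, the median of the predicted values is used,
    as would be done when deriving the cut-off on a training set.
    """
    cutoff = median(predicted_values) if threshold is None else threshold
    groups = ["high" if p > cutoff else "low" for p in predicted_values]
    return groups, cutoff

# Hypothetical predicted values for five validation patients, LC cut-off from the text.
groups, cutoff = stratify_by_threshold([10.0, 25.0, 19.15, 40.0, 5.0], threshold=19.15)
print(groups)  # ['low', 'high', 'low', 'high', 'low']
```

Applying a fixed cut-off derived from the training set to the validation set, rather than recomputing the median there, keeps the validation stratification free of information from the validation outcomes.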

Fig. 5

Kaplan-Meier plots of LC and OS for LRRC patients with different risks stratified based on the RSF and CHR models in the training and validation sets.

Fig. 6

Individual RSF model predicted value stratified by two risk groups and heatmap of selected radscores for LRRC patients. Patients were stratified into high- and low-risk groups based on the RSF model-predicted risk scores. The upper row (A-B) displays bar charts illustrating the 1- and 2-year local progression status across the two risk groups, while C-D show the corresponding 1- and 2-year survival status distributions. The lower panel presents a heatmap of standardized radscores, where rows represent individual predictive radscores, columns correspond to patients (stratified by risk group), and color intensity reflects feature value magnitudes (warmer hues indicating higher values, cooler hues lower values).

Moreover, the tAUC values at 1 and 2 years also confirmed the better discriminative ability of the RSF model: its tAUCs for LC and OS consistently exceeded those of the CHR model in the training and validation sets (p < 0.05). In the validation set, the tAUCs of the RSF model were 0.840 (95%CI 0.758–0.928) and 0.888 (95%CI 0.860–0.997) for 1- and 2-year LC prediction, and 0.835 (95%CI 0.801–0.952) and 0.761 (95%CI 0.667–0.918) for 1- and 2-year OS prediction, respectively, as shown in Table 4. Finally, the LC and OS predicted by the RSF model showed good agreement with the observed outcomes (Fig. S7). In decision curve analysis (DCA), the RSF model also had a better overall net benefit than the CHR model across most threshold probabilities (Fig. S8).

Table 4 The discrimination performance of the RSF and CHR models for predicting LC and OS in the training and validation sets. tAUCs of the RSF and CHR models were calculated and compared in the two sets. CHR Cox proportional hazards regression, tAUC time-dependent area under the curve, RSF random survival forests, CI confidence interval. * p < 0.05, the difference between the RSF and CHR models reached statistical significance in the validation set.

Discussion

Accurate prognostic assessment is critical for enabling clinicians to tailor timely and individualized therapeutic strategies for LRRC patients after RISI treatment. Lu et al.6 first reported, in 66 patients with LRRC treated by CT-guided RISI, that only D90 > 130 Gy, D100 > 55 Gy or V100 > 90% significantly prolonged LC time, and that none of the dosimetric parameters affected OS. However, the utility of these parameters for constructing precise prognostic models is inherently limited by the small cohort size and the use of univariate analysis. To date, no prognostic model has been established for this population, mainly because relapsed patients suitable for RISI form a small group, which limits the availability of eligible cohorts for robust model development. To address these limitations, we developed and validated, for the first time, a machine learning-driven multiscale radiomics prognostic model with a relatively large sample size to optimize clinical decision-making for LRRC patients after RISI treatment.

Radiomics captures comprehensive and valuable prognostic information related to the tumor and its surrounding environment and is increasingly recognized in predicting the prognosis of locally advanced rectal cancer22,23, but it has not been explored in LRRC. In contrast to conventional handcrafted intratumoral RFs, pre-trained DL models have been extensively employed for automated RF extraction, capitalizing on their transfer learning capabilities to overcome the data scarcity prevalent in medical imaging research. Xiao et al. confirmed that CT-based DL features from a pre-trained ResNet-50 significantly enhanced prognostic performance in predicting OS among small-cell lung cancer patients14. Gong et al. also demonstrated that a deep learning signature using the 3D-DenseNet architecture showed better prognostic performance than a radiomic signature for predicting local recurrence-free survival24. Therefore, we adopted transfer learning to extract DL features from four pre-trained models (ResNet-50/101 and DenseNet121/201) to improve prediction performance. The median LC and OS times represent a clinically actionable threshold at which outcomes meaningfully diverge, allowing clinically interpretable risk stratification. Division at the median also ensures a balanced class distribution, which mitigates model bias toward a majority class and enhances training stability in a limited training dataset (n = 132). The binary cross-entropy loss function was employed to measure the discrepancy between the model’s output and the label. This dichotomization has been validated in some deep learning radiomics studies14,26.

The peritumoral subclinical regions are recognized as biologically informative niches, containing critical biomarkers such as localized inflammatory activity, microinvasive tumor fronts, and stromal remodeling factors, as demonstrated in other cancer types15,25. We therefore explored the impact of the peritumoral expansion margin by selecting ROIs at 1–6 mm around the tumor, based on previous studies25,26,27,28, to determine the optimal region, and compared the predictive performance of radscores constructed from the different peritumoral regions for LC and OS prediction. Gu et al. demonstrated that the 4 mm peritumoral expansion radscore yielded the best prediction of recurrence-free survival, with a C-index of 0.74 (95%CI 0.69–0.79)26. Li et al. demonstrated that the combined radscore incorporating the intratumoral and 3 mm peritumoral regions had the best predictive capacity, with a C-index of 0.800 (95%CI 0.681–0.920)27. Pérez‑Morales et al. developed a predictive model integrating peritumoral and intratumoral CT-based radiomic signatures to estimate OS and PFS in lung cancer patients28. These findings underscore that precise selection of region-specific ROIs is critical, and that integrating complementary methodologies significantly enhances prognostic performance. Our findings also demonstrated that RSPeri1mm provided the best predictive accuracy for LC among the peritumoral radscores, as this 1-mm expansion, which corresponds to the clinical target volume, might preserve more tumor-related biological information. Moreover, RSPeri4mm demonstrated superior predictive performance compared to the intratumoral radscore. Our analysis revealed that peritumoral expansions contain distinct prognostic information and biological heterogeneity.

Machine learning enables precise prognostic prediction by interpreting complex patterns within high-dimensional data29. The traditional CHR model requires the data to meet the proportional hazards assumption, whereas the RSF model is not bound by this limitation and has shown excellent prediction performance. Dong et al. found that the RSF model outperformed the Cox regression model in predicting overall survival and stratifying patients after lung transplantation, but they relied only on clinical characteristics for modeling without considering other potential factors17. Nevertheless, machine learning models have not yet been explored in LRRC patients receiving RISI treatment.

In this study, we showed for the first time that the multiscale radiomics-based RSF model outperformed the conventional CHR model in predicting LC and OS, as judged by the C-index, IBS, tAUC, calibration curve, and DCA in the training and internal validation sets. The RSF model consistently provided more accurate predictions of disease progression and survival at specific time points, including 1 and 2 years, supporting its robust prognostic value and broadening its potential applications. Combining multiscale radiomics with clinical factors provides a more comprehensive view of tumor biology, thereby enhancing the prognostic performance of the model. This performance is primarily attributable to the RSF model’s capability to capture complex nonlinear interactions between the selected predictors and outcomes, consistent with its application in other heterogeneous tumors. In addition, the RSF predicted value achieved more accurate risk classification for LC and OS than the CHR model in LRRC patients, which is important for identifying the population most likely to benefit from treatment.

Several limitations of this study merit consideration. First, although our predictive model was developed with robust internal validation, the study lacks external validation on independent cohorts. This is mainly because few datasets are available for LRRC patients receiving this specialized treatment (only a few centers have adequate experience) and because of data-sharing restrictions in multicenter studies. More prospective, multicenter studies are needed to explore the relevant issues and mechanisms and to confirm the generalizability of our models. Second, the tumor delineation for the DL models was based on the largest 2D slice, which may not adequately represent the spatial characteristics of the entire tumor. Future investigations should prioritize 3D analysis to characterize the full spatial features of the tumor. Third, we acknowledge that dichotomizing LC and OS times at the median to train the DL networks discards time-to-event information, potentially limiting predictive precision. In future work with a larger training cohort, we will explore alternatives that respect survival timing, for example using the Cox partial log-likelihood as the loss function to train the networks that build the LC and OS signatures. Finally, we will further explore dosiomics and MRI biomarkers that may carry additional prognostic information to improve prediction performance for LRRC patients suitable for RISI therapy.
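The Cox partial log-likelihood mentioned above as a possible future loss function can be written down compactly. The following plain-Python sketch computes its negative for a set of predicted log-risk scores. It is an illustration of the general formula only, not part of the present study: it uses the Breslow-style risk set (all patients with follow-up at least as long as the event time) and makes no special correction for tied event times. The function name and toy data are hypothetical.

```python
import math

def neg_cox_partial_log_likelihood(times, events, log_risks):
    """Negative Cox partial log-likelihood.

    For each patient i with an observed event, the contribution is
        -(h_i - log(sum over j with t_j >= t_i of exp(h_j)))
    where h is the predicted log-risk. Censored patients appear only
    inside the risk sets of earlier events.
    """
    total = 0.0
    n = len(times)
    for i in range(n):
        if events[i] == 1:
            risk_set_sum = sum(math.exp(log_risks[j])
                               for j in range(n) if times[j] >= times[i])
            total -= log_risks[i] - math.log(risk_set_sum)
    return total

# Toy check: with equal log-risks the loss depends only on risk-set sizes.
print(neg_cox_partial_log_likelihood([1, 2, 3], [1, 0, 1], [0.0, 0.0, 0.0]))  # log(3)

# Scores that rank early events as high risk give a lower loss than the reverse.
good = neg_cox_partial_log_likelihood([1, 2, 3], [1, 1, 1], [2.0, 0.0, -2.0])
bad = neg_cox_partial_log_likelihood([1, 2, 3], [1, 1, 1], [-2.0, 0.0, 2.0])
print(good < bad)  # True
```

Unlike the median-split labels, this loss uses the ordering of event times and the censoring indicator directly, which is the sense in which it respects survival timing.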

Conclusions

In conclusion, this study investigated the prognostic value of intratumoral, peritumoral and deep learning features for LC and OS prediction. The RSF model incorporating multiscale radscores and clinical features may outperform the traditional CHR model in LC and OS prediction and risk stratification. Despite some limitations, this study is the first to apply an RSF model integrating clinical variables with intra- and peri-tumoral radiomics to prognostic prediction in LRRC patients after RISI, and the proposed method may have significant potential for future clinical application.