Introduction

Colorectal cancer (CRC) is a major cause of cancer-related morbidity and mortality worldwide1,2, affecting physical health and imposing a serious social burden. According to the global cancer statistics for 2022, there were about 1.9 million new cases of CRC worldwide and about 904,000 deaths3. CRC accounts for nearly one tenth of cancer cases and deaths, ranking third in incidence and second in mortality3. According to data from the Surveillance, Epidemiology and End Results (SEER) database (https://seer.cancer.gov/statistics-network/explorer/; November 2023 submission, 1975–2021), the incidence of CRC increases with age, with a median age at diagnosis of 66 years and more than half of patients diagnosed between the ages of 65 and 84 years. It is one of the most frequently diagnosed malignancies in the elderly4.

Surgery is the cornerstone treatment in stage I–III CRC patients5,6. CRC patients who undergo surgery are at risk of developing new cardiovascular disease, which is associated with reduced survival7. In a large clinical study, older adults with stage I–III CRC were found to have a substantial risk of new-onset cardiovascular and cerebrovascular death (CVD)8. Older people with CRC may have an increased risk of cardiovascular death due to age-related comorbidities, anti-cancer treatments, cardiovascular toxicity, direct cancer biological mechanisms, and their common risk factors9,10,11,12. It is necessary to consider the impact of CVD on the study results when performing survival analysis of older patients with CRC.

In previous prognostic studies, the Cox proportional hazards regression model and Kaplan–Meier survival analysis have been widely used to predict survival outcomes in elderly patients with CRC after surgery. In traditional survival analysis, all events other than the event of interest are treated as censored, which may introduce bias. For analysing an event of interest in the presence of competing events, Fine-Gray competing risk regression is a suitable method13,14.
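The bias that arises from censoring competing events can be illustrated on a toy cohort. The following minimal Python sketch is not the analysis performed in this study (which was carried out in R). The data, the function name `cumulative_incidence` and the status coding (0 = censored, 1 = event of interest, 2 = competing event) are assumptions made purely for illustration. It compares the nonparametric Aalen-Johansen estimate of the cumulative incidence with the naive estimate obtained by recoding competing events as censored, which is equivalent to 1 minus the Kaplan–Meier estimate.

```python
def cumulative_incidence(times, status, cause):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    status: 0 = censored, any other integer = an event type.
    Returns a list of (time, CIF) pairs, one per distinct event time.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [s for u, s in data if u == t]
        d_cause = sum(1 for s in tied if s == cause)
        d_any = sum(1 for s in tied if s != 0)
        if d_any:
            # Probability of reaching t event-free, then failing from `cause`.
            cif += event_free * d_cause / at_risk
            event_free *= 1 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Hypothetical toy cohort (not study data).
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
status = [1, 2, 1, 0, 2, 1, 2, 0, 1, 2]

# Estimate that accounts for the competing event.
cif_competing = cumulative_incidence(times, status, cause=1)

# Naive estimate: competing events recoded as censored (equals 1 - KM).
naive_status = [s if s == 1 else 0 for s in status]
cif_naive = cumulative_incidence(times, naive_status, cause=1)

print(round(cif_competing[-1][1], 3), round(cif_naive[-1][1], 3))
```

On this toy cohort the naive estimate ends higher than the estimate that accounts for the competing event, which is the direction of bias expected whenever competing events are treated as censored.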

Competing risk has emerged as an important concept in the design and reporting of geriatric oncology trials15. The Fine-Gray subdistribution hazard model is a commonly used method for analyzing survival data in cohort studies. Fine and Gray developed competing risk regression, which accounts for competing risks by modelling the effects of predictors on the cumulative incidence function (CIF)12. During follow-up, a subject may experience not only the event of interest but also other terminal events. When such an event prevents the event of interest from occurring, or reduces the probability of its occurrence, the events are said to be in a competing risk relationship, and the other event is called a competing risk event. When competing risk events are present, the Fine-Gray subdistribution hazard model gives more accurate and stable predictions than methods that treat these events as censored. For elderly CRC patients, it is therefore more reasonable to account for the interference of CVD events with cancer-specific death when predicting postoperative survival.
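For reference, the quantities named above can be written out in their standard textbook form. The notation is introduced here for illustration and is not taken from the original text: T is the time to the first event, D the event type (1 = CRC-specific death, 2 = CVD death), x the covariate vector and β the regression coefficients.

```latex
\begin{align}
F_1(t) &= \Pr(T \le t,\; D = 1)
  && \text{(cumulative incidence function)} \\
\lambda_1(t) &= \lim_{\Delta t \to 0}
  \frac{\Pr\{t \le T < t + \Delta t,\; D = 1 \mid T \ge t \ \text{or}\ (T < t,\; D \ne 1)\}}{\Delta t}
  = -\frac{d}{dt}\log\{1 - F_1(t)\}
  && \text{(subdistribution hazard)} \\
\lambda_1(t \mid x) &= \lambda_{10}(t)\,\exp(\beta^{\top} x)
  && \text{(Fine-Gray model)} \\
F_1(t \mid x) &= 1 - \exp\{-\Lambda_{10}(t)\,\exp(\beta^{\top} x)\},
  \qquad \Lambda_{10}(t) = \int_0^t \lambda_{10}(s)\,ds
\end{align}
```

Subjects who have already experienced the competing event remain in the risk set of the subdistribution hazard. This is why the regression coefficients relate directly to the cumulative incidence rather than to the cause-specific hazard.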

The use of random survival forest (RSF) to address competing risks is completely non-parametric. This method can be used for selecting event-specific variables and for estimating the cumulative incidence function. This method is highly effective for both prediction and variable selection in high-dimensional problems and in settings that involve many competing risks16.

In this study, CVD was treated as a competing risk event for CRC-related death. A Fine-Gray subdistribution hazard model and a competing risk-based RSF model for predicting postoperative cancer-specific survival (CSS) in elderly patients with stage I–III CRC were constructed based on the SEER database. These models are expected to help clinicians make individualized predictions of CSS in elderly patients with stage I–III CRC after surgery.

Materials and methods

Study population

The data for this study were obtained from the SEER database established by the National Cancer Institute. We selected the database containing 17 registries, which provided the data needed for this study. SEER*Stat software (version 8.4.3) was used to extract clinical data on older patients with stage I–III CRC diagnosed from 2010 to 2015. In addition, data from 2018–2021 were extracted as an external validation set. This study did not require ethics committee approval or patient consent because the data are publicly available and contain no personally identifiable information.

Definition of elderly

Several definitions of elderly patients exist in the literature, with no clear and definite criteria. Most studies consider patients aged 65 years or older to be elderly, but significant heterogeneity exists. Moreover, the WHO has recently published a new age cut-off for the elderly of 75 years. The definition of elderly should not be based on chronological age alone but on several factors that determine biological age. These factors are difficult to measure, and no clear and objective definitions are available. For these reasons, we defined “elderly” as patients aged 65 years or older17,18.

Inclusion and exclusion criteria

Patients with Stage IV CRC typically have markedly heterogeneous backgrounds compared to other stages19,20,21,22. This heterogeneity may confound results. Patients with stage IV CRC were excluded from the study.

The significant clinical and pathological differences between appendiceal tumors and other colorectal malignancies suggest that their pathogenesis may also differ, although this remains to be investigated. Mainly because appendiceal tumors are relatively rare, they are usually excluded from studies of prognostic variables in CRC23. We also excluded patients with appendiceal tumors from this study.

Patients who underwent surgery were identified using the SEER variable “RX Summ-Surg Prim Site (1998+)”. Patients who did not undergo surgery (code 0) and those with codes 90 or 99, which were treated as missing information, were excluded.

Inclusion criteria: (i) Patients diagnosed with CRC between 2010 and 2015. (ii) Patients aged 65 years or older. (iii) Tumors of the colon or rectum (excluding the appendix), identified by the ICD-O-3 site codes C18.0, C18.2–C18.9, C19.9 and C20.9. (iv) Patients with pathologically confirmed CRC. (v) Patients with only one primary site. (vi) Patients with complete follow-up data.

Exclusion criteria: (i) Survival time < 1 month. (ii) Unknown clinical information (race, marital status, etc.). (iii) No primary site surgery (radical or local resection) or unknown surgery status. (iv) Unknown cause of death.

The external validation data were screened using the same criteria. The flowchart of the inclusion and exclusion process is shown in Fig. 1.

Fig. 1 Flowchart for the selection procedure of patients with stage I–III CRC.

Variable selection and outcomes

Demographic variables included age at diagnosis, race, sex, and marital status. Tumor characteristics consisted of primary site, histologic type, T stage, N stage, grade, size, carcinoembryonic antigen (CEA), tumor deposits and perineural invasion. To make the analysis more intuitive and standardized, the data were transformed into dichotomous or multi-categorical variables. Age was classified into three groups: 65–74 years, 75–84 years and ≥ 85 years9,24; race was classified as black, white, or other (Asian or Pacific Islander; American Indian/Alaska Native); marital status was classified as married or unmarried; and histological type was classified as adenocarcinoma or other. Primary site was categorized as colon (C18.0, C18.2–C18.9) or rectum (C19.9, C20.9); tumor size as ≤ 5 cm or > 5 cm25; grade as I, II, III or IV; T stage as T1, T2, T3 or T4; and N stage as N0, N1 or N2. Perineural invasion and tumor deposits were each classified as yes or no, and CEA as positive or negative.

Patients were then divided into four groups according to survival status and cause of death: (1) alive; (2) CVD, which included diseases of heart, cerebrovascular diseases, hypertension without heart disease, atherosclerosis, aortic aneurysm and dissection, and other diseases of arteries, arterioles, and capillaries26,27; (3) CRC-specific death; and (4) other events, which included deaths from other cancers and other non-cancer deaths. Causes of death were obtained from the SEER Cause of Death Recode 1969+ (03/01/2018) (cancer.gov).
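The categorization and outcome grouping described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the numeric outcome codes (chosen here so that 1 = CRC-specific death and 2 = CVD, matching the labels used in Fig. 3) and the lower-case cause-of-death labels are assumptions, and the exact strings in a SEER export may differ. The sketch makes no claim about how the "other events" group was handled in the models.

```python
# Hypothetical cause-of-death labels for the CVD group; the exact SEER
# strings may differ from these.
CVD_CAUSES = {
    "diseases of heart",
    "cerebrovascular diseases",
    "hypertension without heart disease",
    "atherosclerosis",
    "aortic aneurysm and dissection",
    "other diseases of arteries, arterioles, capillaries",
}


def age_group(age):
    """Map age at diagnosis to the three age groups used in the study."""
    if age < 65:
        raise ValueError("cohort is restricted to patients aged 65 or older")
    if age <= 74:
        return "65-74"
    if age <= 84:
        return "75-84"
    return ">=85"


def size_group(size_cm):
    """Dichotomize tumor size at 5 cm."""
    return "<=5 cm" if size_cm <= 5 else ">5 cm"


def outcome_code(alive, cause, crc_death):
    """Illustrative coding: 0 = alive, 1 = CRC-specific death,
    2 = CVD death, 3 = other death."""
    if alive:
        return 0
    if crc_death:
        return 1
    if cause.lower() in CVD_CAUSES:
        return 2
    return 3


print(age_group(78), size_group(4.2), outcome_code(False, "Atherosclerosis", False))
```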

Statistical analysis

We randomly split the study cohort 7:3 into a training set and an internal test set. The training set was used to build the models, which were then validated on the internal and external test sets. Pearson’s correlation coefficient and the variance inflation factor (VIF) were used to evaluate correlation and collinearity among the predictors28,29,30. A VIF above 5 to 10 was taken to indicate multicollinearity28,30. A correlation coefficient > 0.7 between two independent variables was taken to indicate high correlation29,31.
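The split and the collinearity checks described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study. The helper names, the random seed and the integer-coded toy predictors are assumptions. The VIF is computed through the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix.

```python
import random


def split_70_30(n, seed=0):
    """Randomly split indices 0..n-1 into 70% training and 30% test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(0.7 * n)
    return idx[:cut], idx[cut:]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(m):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(m)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inv = invert(corr)
    return [inv[j][j] for j in range(k)]


# Hypothetical integer-coded predictors (not study data).
t_stage = [1, 2, 3, 3, 4, 2, 3, 4, 1, 3]
n_stage = [0, 0, 1, 0, 2, 0, 1, 2, 0, 1]
age_grp = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]

train_idx, test_idx = split_70_30(len(t_stage))
vifs = vif([t_stage, n_stage, age_grp])
print(len(train_idx), len(test_idx), [round(v, 2) for v in vifs])
```

With the thresholds quoted above, a predictor would be flagged when its VIF exceeds 5 to 10 or when any pairwise correlation exceeds 0.7.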

Fine-Gray subdistribution hazard model and nomogram

Fine-Gray subdistribution hazard models were used to analyze the data, with CRC-specific death as the outcome event and CVD as the competing event. First, we performed a univariate analysis, plotting CIF curves for CRC-specific death and CVD and comparing differences between groups using Gray’s test. Second, variables that were statistically significant (p < 0.05) were included in the multivariate analysis. Third, the Fine-Gray subdistribution hazard model was constructed in the training set from the variables that remained statistically significant in the multivariate analysis, and a nomogram of this model was established to predict CSS at 1, 3, and 5 years.

Traditional Cox nomogram

We screened variables associated with CSS by univariate and multivariate Cox regression analysis and plotted a nomogram predicting CSS at 1, 3, and 5 years. The competing risk nomogram was compared with the traditional Cox nomogram using the same variables.

Random survival forests model for competing risks

RSF is a tree-based survival model for the analysis of right-censored survival data. To develop and validate the RSF, the data were divided into a learning part (63% of the data, used to develop the model) and a test part (37% of the data, used to check validity). In total, 1000 bootstrap samples were drawn from the learning part, and a competing risk tree was grown for each bootstrap sample. To split each node of a tree, a subset of p variables was selected at random, and the node was split using the candidate variable that maximized a competing risk splitting rule. Each tree was grown to full size under the constraint that a terminal node should contain no fewer than a prespecified minimum number of unique cases. Cumulative incidence functions and cumulative cause-specific hazards for all events (death from CRC, death from CVD) were then calculated for each tree. Finally, each estimator was averaged over all trees to obtain its ensemble16. In RSF, variables can be selected by filtering on the basis of their variable importance (VIMP)16. The VIMP for a risk factor x is the prediction error of the new ensemble obtained by randomizing the x assignments minus the prediction error of the original ensemble. A large positive VIMP indicates a potentially predictive variable, whereas zero or negative values identify non-predictive variables to be filtered out16.
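Two ideas in the paragraph above can be made concrete with a short Python sketch. First, the 63%/37% split is close to the expected shares of cases that fall inside and outside a bootstrap sample, which may be what the description refers to; this is an interpretation, not something stated in the text. Second, VIMP is a difference of prediction errors. The sketch below uses simple permutation of one feature and a mean squared error as stand-ins; the cited method instead randomizes x assignments within the forest and uses a survival-specific error. All names and data here are hypothetical.

```python
import random


def expected_in_bag_fraction(n):
    """Expected share of distinct cases in a bootstrap sample of size n."""
    return 1 - (1 - 1 / n) ** n


def permutation_vimp(predict, error, X, y, j, seed=0):
    """Prediction error after permuting feature j minus the original error."""
    rng = random.Random(seed)
    base = error(y, [predict(row) for row in X])
    shuffled = [row[j] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, shuffled)]
    return error(y, [predict(row) for row in X_perm]) - base


def mse(y, pred):
    """Mean squared error, used here as a stand-in prediction error."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


def predict(row):
    """Stand-in for a fitted ensemble: uses feature 0 only."""
    return row[0]


# Hypothetical toy data in which the outcome depends on feature 0 only.
X = [[float(i), float(i % 3)] for i in range(8)]
y = [row[0] for row in X]

vimp_0 = permutation_vimp(predict, mse, X, y, j=0)
vimp_1 = permutation_vimp(predict, mse, X, y, j=1)
print(round(expected_in_bag_fraction(1000), 3), vimp_0 > 0, vimp_1)
```

In this toy setting the informative feature receives a positive VIMP and the uninformative one a VIMP of zero, mirroring the filtering rule described above.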

The selected variables were used to construct the RSF model and the Fine-Gray subdistribution hazard model, and the importance of the predictors was assessed. The mtry and nodesize hyperparameters were tuned using the tune.rfsrc function in randomForestSRC. The optimal hyperparameters were then used to build the final model in the training set. The RSF model can also be used to plot CIF curves for CRC-specific death and CVD.

Model evaluation

The performance of the Fine-Gray subdistribution hazard model and the RSF model was evaluated from three aspects: accuracy, calibration and clinical benefit. Ten-fold cross-validation was used to validate the model built on the training set and to ensure its stability. Receiver operating characteristic (ROC) curves were used to assess model discrimination by calculating the C-statistic, or area under the curve (AUC). We also assessed the ability of each prediction model to discriminate outcomes on the test dataset using Harrell’s concordance index (C-index)32. The AUC and C-index range from 0 to 1, and values closer to 1 indicate a more accurate model. We used the Brier score (BS), the mean squared difference between the predicted probabilities and the actual outcomes, to assess the accuracy of each model33. The BS ranges from 0 to 1, with 0 representing the best possible calibration33. A higher AUC or C-index and a lower BS indicate better predictive performance. In addition, decision curve analysis (DCA) was used to evaluate the clinical benefit of the model, that is, whether the model could benefit patients by informing clinical decision-making. At a given threshold probability, the net benefit of the model was compared with that of the “None” and “All” strategies; a net benefit higher than both indicates that the model has clinical utility.
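The Brier score and the net benefit used in DCA can be written out explicitly. The Python sketch below is a simplified illustration under a strong assumption: every patient has a known binary outcome at the prediction horizon, so censoring and competing risks are ignored. The study's own calculations were done in R and handle time-to-event data. Function names and the toy predictions are hypothetical.

```python
def brier_score(pred, outcome):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - o) ** 2 for p, o in zip(pred, outcome)) / len(pred)


def net_benefit(pred, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(pred)
    tp = sum(1 for p, o in zip(pred, outcome) if p >= threshold and o == 1)
    fp = sum(1 for p, o in zip(pred, outcome) if p >= threshold and o == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_all(outcome, threshold):
    """Net benefit of the 'All' strategy (treat everyone)."""
    prev = sum(outcome) / len(outcome)
    return prev - (1 - prev) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed binary outcomes (not study data).
pred = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7, 0.4, 0.05]
outcome = [1, 1, 0, 0, 0, 1, 0, 0]

bs = brier_score(pred, outcome)
nb_model = net_benefit(pred, outcome, threshold=0.5)
nb_all = net_benefit_all(outcome, threshold=0.5)
nb_none = 0.0  # the 'None' strategy treats nobody, so its net benefit is zero

print(bs, nb_model, nb_all, nb_none)
```

At the chosen threshold the model would be judged clinically useful if `nb_model` exceeds both `nb_all` and `nb_none`, which is the comparison a decision curve displays across a range of thresholds.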

R software (version 4.4.2; http://www.r-project.org/) was used for all statistical analyses. The “cmprsk”, “riskRegression”, and “prodlim” packages were used for competing risk analysis, and the “mstate” and “regplot” packages for plotting the competing risk nomogram. We performed Cox regression analysis using the “survival” package and plotted the traditional Cox nomogram using the “regplot” package. The “randomForestSRC” package was used to construct the RSF model. Statistical significance was defined as a two-sided p value < 0.05.

Results

Patients’ characteristics

A total of 19195 elderly patients with stage I–III CRC who underwent primary site surgery between 2010 and 2015 were included in the study. There were 10305 deaths among all patients, including 4253 deaths specific to CRC, 2571 deaths due to cardiovascular and cerebrovascular diseases, 379 deaths due to other neoplastic diseases and 3120 deaths due to other non-neoplastic diseases. Baseline characteristics are shown in Table 1. The majority of patients were white (81.2%), and the most common histological type was adenocarcinoma (89.7%). Most tumors were moderately differentiated (72.7%). T3 (58.6%) and N0 (62.2%) were the most common T and N stages. Most patients had no perineural invasion (90.3%) and no tumor deposits (90.2%). Tumors were located in the colon in 81.3% of patients and in the rectum in 18.7%, and most patients had tumors ≤ 5 cm (58.7%).

Table 1 Characteristics of the included older patients with stage I–III CRC who underwent surgery at the primary site n (%).

All the correlation coefficients between pairs of variables were < 0.7 and the VIF values were < 5, indicating no collinearity among the independent variables (Fig. 2).

Fig. 2 (a) VIF plot. (b) Pearson’s correlation coefficients between pairs of variables.

Gray’s test and Fine-Gray subdistribution hazard model

First, we plotted the CIF curves and compared groups using Gray’s test (Fig. 3). The results indicated that age, race, marital status, primary site, T stage, N stage, grade, histological type, perineural invasion, CEA, tumor size and tumor deposits were associated with postoperative CSS in elderly patients with stage I–III CRC. The factors associated with CVD were age, race, marital status, primary site, tumor grade, T stage, N stage, CEA, tumor deposits and perineural invasion.

Fig. 3 CIF curves for CRC-specific death (1) and CVD (2) by different factors: (a) age; (b) sex; (c) race; (d) marital status; (e) grade; (f) T stage; (g) N stage; (h) perineural invasion; (i) CEA; (j) tumor deposits; (k) tumor size; (l) primary site.

Second, we included the variables that were statistically significant in the univariate analysis in the multivariate competing risk analysis. The results indicated that age, race, marital status, primary site, T stage, N stage, grade, perineural invasion, CEA and tumor deposits were independent prognostic factors of postoperative CSS in elderly patients with stage I–III CRC (Table 2).

Table 2 Multivariate competing risk analysis of postoperative CSS in elderly patients with stage I–III CRC.

Third, we used the independent prognostic factors to construct the Fine-Gray subdistribution hazard model in the training set and then evaluated its performance. In addition, the model was visualized as a nomogram (Fig. 5a).

The evaluation of the Fine-Gray subdistribution hazard model constructed through univariate and multivariate screening is shown in Fig. 4. The model showed good discrimination and accuracy. The 1-year, 3-year and 5-year C-indices were 0.771, 0.775 and 0.759 in the train set, and 0.744, 0.762 and 0.753 in the internal test set. The 1-year and 3-year C-indices in the external validation set were 0.762 and 0.775. The 1-year, 3-year and 5-year AUCs were 0.782 (95% CI 0.765, 0.798), 0.8 (95% CI 0.79, 0.811) and 0.786 (95% CI 0.776, 0.796) in the train set, and 0.754 (95% CI 0.727, 0.782), 0.786 (95% CI 0.769, 0.802) and 0.782 (95% CI 0.766, 0.797) in the internal test set (Fig. 4a, b). The 1-year and 3-year AUCs were 0.77 (95% CI 0.749, 0.79) and 0.83 (95% CI 0.786, 0.82) in the external validation set (Fig. 4c). The 1-year, 3-year and 5-year BS values were 0.053 (95% CI 0.050, 0.056), 0.104 (95% CI 0.101, 0.107) and 0.128 (95% CI 0.124, 0.132) in the train set, and 0.050 (95% CI 0.044, 0.056), 0.106 (95% CI 0.098, 0.112) and 0.130 (95% CI 0.124, 0.136) in the internal test set (Fig. 4d, e). The 1-year and 3-year BS values were 0.042 (95% CI 0.038, 0.044) and 0.085 (95% CI 0.078, 0.092) in the external validation set (Fig. 4f). The decision curves (median survival time) showed that this model provides a clinical benefit for patients (Fig. 4g, h, i). The DCA indicated that when the threshold probabilities ranged between 10% and 40%, 15% and 70%, and 20% and 75%, the use of the nomogram to predict 1-year, 3-year and 5-year CSS, respectively, provided a greater net benefit than the “all” or “none” strategies, which indicates the clinical usefulness of the nomogram.

Fig. 4 (a) ROC curve and AUC for 1, 3, and 5 years of the Fine-Gray subdistribution hazard model in the train set. (b) ROC curve and AUC for 1, 3, and 5 years of the Fine-Gray subdistribution hazard model in the internal test set. (c) ROC curve and AUC for 1 and 3 years of the Fine-Gray subdistribution hazard model in the external test set. (d) Brier score curve of the Fine-Gray subdistribution hazard model in the train set. (e) Brier score curve of the Fine-Gray subdistribution hazard model in the internal test set. (f) Brier score curve of the Fine-Gray subdistribution hazard model in the external test set. (g) Decision curve for 1 year of the Fine-Gray subdistribution hazard model. (h) Decision curve for 3 years of the Fine-Gray subdistribution hazard model. (i) Decision curve for 5 years of the Fine-Gray subdistribution hazard model.

Competing risk nomogram vs. traditional Cox nomogram

Multivariate Cox regression analysis was used to explore the factors affecting CSS after surgery in elderly patients with stage I–III CRC (Table 3). The results showed that age, race, marital status, grade, primary site, T stage, N stage, CEA, perineural invasion and tumor deposits were independent prognostic factors of postoperative CSS in elderly patients with stage I–III CRC. We created a traditional Cox nomogram in the train set (Fig. 5b).

Table 3 Multivariate Cox analysis of postoperative CSS in elderly patients with stage I–III CRC.
Fig. 5 (a) Competing risk nomogram to predict 1-, 3-, and 5-year CSS after surgery in elderly patients with stage I–III CRC. (b) Traditional Cox nomogram to predict 1-, 3-, and 5-year CSS after surgery in elderly patients with stage I–III CRC.

In the nomogram, each value of a variable corresponds to a score. The total score is obtained by summing the scores of all variables, and the patient’s CSS at 3 and 5 years can be predicted from the total score. For example, consider a patient aged 65–74 years who is white, married and female, with a grade II, T2, N0 adenocarcinoma of the rectum, negative CEA, positive tumor deposits, no perineural invasion, and a tumor size of less than 5 cm. The total score was 159 in the competing risk nomogram but 180 in the traditional Cox nomogram (Fig. 5a, b). For the same variables, the traditional Cox model overestimates the patient’s CSS at 1, 3, or 5 years.

Variables were screened based on competing risk data and RSF model

First, RSF was used to screen variables based on the competing risk data using the VIMP method. The variables were ranked by importance, as shown in Fig. 6a. According to the RSF variable importance, N stage, T stage, tumor deposits, age and CEA were the five most important variables. Considering variable importance and VIMP values together, we selected ten variables: age, race, primary site, tumor grade, histological type, T stage, N stage, CEA, tumor deposits, and perineural invasion. The optimal hyperparameters identified by tuning were ‘ntree’ = 480, ‘mtry’ = 7 and ‘nodesize’ = 85 (Fig. 6b). The competing risk graph is shown in Fig. 6c.

Fig. 6 Construction of the RSF model for predicting CSS after surgery in elderly patients with stage I–III CRC in the train set. (a) Prediction error rates and the VIMP plot. (b) Hyperparameter tuning for RSF. (c) Competing risk graph. CS-CHIF: cause-specific cumulative hazard function; CIF: cumulative incidence function; CPC: continuous probability curves.

The RSF model we established showed good discrimination and accuracy. The 1-year, 3-year and 5-year C-indices of the RSF model were 0.801, 0.788 and 0.769 in the train set, and 0.744, 0.754 and 0.745 in the internal test set. The 1-year and 3-year C-indices in the external validation set were 0.761 and 0.771. The 1-year, 3-year and 5-year AUCs were 0.792 (95% CI 0.776, 0.807), 0.813 (95% CI 0.802, 0.823) and 0.801 (95% CI 0.791, 0.811) in the train set, and 0.749 (95% CI 0.721, 0.777), 0.779 (95% CI 0.762, 0.796) and 0.782 (95% CI 0.767, 0.798) in the internal test set (Fig. 7a, b). The 1-year and 3-year AUCs were 0.767 (95% CI 0.747, 0.788) and 0.8 (95% CI 0.783, 0.817) in the external validation set (Fig. 7c). The 1-year, 3-year and 5-year BS values were 0.053 (95% CI 0.051, 0.057), 0.105 (95% CI 0.102, 0.108) and 0.131 (95% CI 0.128, 0.134) in the train set, and 0.051 (95% CI 0.045, 0.055), 0.109 (95% CI 0.102, 0.116) and 0.132 (95% CI 0.125, 0.140) in the internal test set (Fig. 7d, e). The 1-year and 3-year BS values were 0.042 (95% CI 0.038, 0.045) and 0.086 (95% CI 0.082, 0.091) in the external validation set (Fig. 7f). The decision curves (1-year, 3-year and 5-year) showed that this model provides a clinical benefit for patients (Fig. 7g, h, i). The DCA indicated that when the threshold probabilities ranged between 10% and 50%, 15% and 80%, and 20% and 80%, the use of the model to predict 1-year, 3-year and 5-year CSS, respectively, provided a greater net benefit than the “all” or “none” strategies, which indicates the clinical usefulness of the RSF model.

Fig. 7 (a) ROC curve and AUC for 1, 3, and 5 years of the RSF model in the train set. (b) ROC curve and AUC for 1, 3, and 5 years of the RSF model in the internal test set. (c) ROC curve and AUC for 1 and 3 years of the RSF model in the external test set. (d) Brier score curve of the RSF model in the train set. (e) Brier score curve of the RSF model in the internal test set. (f) Brier score curve of the RSF model in the external test set. (g) Decision curve for median survival time (1 year) of the RSF model in the train set. (h) Decision curve for median survival time (3 years) of the RSF model in the train set. (i) Decision curve for median survival time (5 years) of the RSF model in the train set. (j) Brier score curves of the RSF model.

Discussion

This study highlights the critical role of competing risk adjustment in survival analyses for elderly CRC patients. The Fine-Gray subdistribution hazard model is more suitable for prognostic estimation because it takes competing risks into account. We aimed to analyze CSS in elderly patients with stage I–III CRC based on the SEER database, with CVD death considered a competing event. Although many survival analyses of patients with stage I–III CRC have been performed, most analysed overall survival by treating all outcomes as one, which may not be suitable for identifying CRC patients with different risks of various outcomes34,35. In previous studies, patients with competing events were treated as censored, and the results were therefore biased to varying degrees. The traditional Cox model overestimated CSS by ignoring competing events, consistent with previous studies36,37,38, whereas our Fine-Gray subdistribution hazard model provided more accurate risk stratification. When performing a survival analysis, it is necessary to account for the effect of competing risk events on the target outcome.

One study showed that high T stage (T4), rectal location, high N stage (N2), elevated CEA, poor tumor differentiation, and age at diagnosis were each poor prognostic factors for postoperative cancer-specific survival in stage I–III CRC, which is consistent with our study39. Perineural invasion is a relatively new histopathological feature that is associated with poor clinical prognosis and reduced survival in various malignancies, including CRC40,41. In most studies, the reported incidence of perineural invasion in CRC patients ranges between 9 and 33%41,42. In our cohort, perineural invasion was present in 828 (19.5%) of the patients who died from CRC. Evaluating perineural invasion after surgery helps to predict prognosis and guide further treatment. Tumor deposits, considered an independent predictor of prognosis in CRC patients, provide important prognostic information and warrant further investigation as a distinct variable in future CRC staging43. In an RCT of patients with stage III colon cancer, being divorced, separated or widowed and living with other family members were significantly associated with worse colon cancer recurrence and mortality, respectively, compared with being married or living with a spouse or partner44. This is also consistent with our study. In addition, the RSF model can rank the importance of variables. N stage, T stage, tumor deposits, age and CEA were the five most important variables, which gives clinicians a more intuitive understanding of the indicators that most affect the outcome. By understanding the factors that affect the survival of elderly patients with stage I–III CRC, clinicians can better manage high-risk populations.

In this study, the Fine-Gray subdistribution hazard model and the competing risk-based RSF model were used to analyze postoperative CSS in elderly patients with stage I–III CRC, and both models performed well. The nomogram is a simple and feasible clinical tool for individualized prediction of CSS in elderly patients with CRC and can provide a basis for their individualized postoperative management. With the advent of the medical big data era, machine learning models will be increasingly used in clinical practice to help improve the prognosis of patients.

Because competing events were taken into account, the predictions in this study were more accurate and stable than those of the traditional Cox model. This approach is also more reasonable for predicting the postoperative survival of elderly patients with stage I–III CRC.

Limitation

There are several limitations to this study. First, our analysis quantifies the hazard associated with many variables related to CSS in CRC but does not take patients’ comorbidities into account because of the limitations of SEER. Nevertheless, the results of this study remain meaningful, although they leave room for improvement. Second, because the demographic and clinical information provided by the SEER database is incomplete, more than 40,000 individuals were excluded, which may have introduced some selection bias.

Third, although the SEER database is a high-quality population-based cancer registry, some information is still incomplete or unavailable, such as BMI, dietary habits, biomarkers, biochemical test results, and lifestyle factors (smoking and drinking). Details of radiotherapy and chemotherapy are also missing from the SEER database. Radiotherapy data are classified by the type of RT received or "no/unknown – no evidence of radiation was found in the medical records examined". Chemotherapy data are categorized as either "yes – patient had chemotherapy" or "no/unknown – no evidence of chemotherapy was found in the medical records examined" (https://seer.cancer.gov/data-software/documentation/seerstat/nov2023/treatment-limitations-nov2023.html). Despite these limitations, the present analysis is still meaningful and can offer some information for clinical management.

Conclusions

Based on the SEER database, this study established a Fine-Gray subdistribution hazard model and an RSF model for postoperative survival in elderly patients with stage I–III CRC. The models show good predictive performance and can provide guidance for clinical work. The Fine-Gray subdistribution hazard model is visualized as a nomogram, which makes predicting CSS in elderly patients with stage I–III CRC more convenient and intuitive. Prediction models constructed with competing risk events in mind are more suitable for elderly cancer patients.