Introduction

In 2022, primary liver cancer was the sixth most commonly diagnosed cancer and the third leading cause of cancer-related mortality worldwide, with an estimated 865,000 new cases and 758,000 deaths1. Compared with other tumors, liver cancer is more prone to metastasis and recurrence, and patients are highly heterogeneous in their demographic and clinicopathological characteristics and in the treatment strategies employed. Consequently, prognosis varies significantly between individual patients, presenting a substantial challenge to global healthcare systems2.

Hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) are the two most common forms of primary liver cancer, while combined hepatocellular-cholangiocarcinoma (cHCC-CCA) is an extremely rare type with an incidence of approximately 2–5%, characterized by the simultaneous presence of hepatocellular and cholangiocellular differentiation features3,4,5,6. Some studies suggest that the demographic and clinical characteristics of cHCC-CCA are most similar to those of cholangiocarcinoma patients, while others indicate that cHCC-CCA should be regarded as a variant of HCC with cholangiocellular features7,8. Of note, the diagnosis and treatment of cHCC-CCA are particularly challenging because of the ambiguity of its name and histological definition and the lack of consensus on staging, diagnosis, and treatment. As an aggressive tumor, cHCC-CCA has a poor prognosis. This is usually related to misdiagnosis on preoperative imaging, where the lesion is often classified as HCC or ICC9. Meanwhile, although many treatment options have been proposed, prospective studies on cHCC-CCA are still lacking, and surgery remains the primary treatment for localized cHCC-CCA at present10,11.

Currently, the American Joint Committee on Cancer (AJCC) tumor staging system is the most widely used staging system for most malignant tumors12,13. In the eighth edition of the AJCC staging system, cHCC-CCA and ICC are classified under the same category. However, some studies have pointed out significant differences in prognosis between cHCC-CCA and ICC, indicating that cHCC-CCA should be considered a separate entity14,15. Therefore, the AJCC staging system is not the optimal staging system for cHCC-CCA. In recent years, predictive models constructed through various methods (such as random forest regression, machine learning, optimization algorithms, nomograms, etc.) have been widely applied in various fields, including medicine16,17,18,19. Among them, clinical models based on machine learning or nomograms have been widely used for survival prediction in cancer patients20. However, machine learning models typically require a large amount of training data to perform well. When data are scarce, as in studies of rare tumors, such models may perform poorly. In contrast, nomograms, especially those that incorporate external validation, are often favored for their intuitiveness and interpretability21,22. Nomograms have also demonstrated predictive accuracy across various types of tumors, contributing to the advancement of personalized medicine23. A nomogram-based prognostic model for cHCC-CCA can account not only for tumor invasion and metastasis but also for other important individual clinical factors (such as gender, marital status, etc.), offering certain advantages in predicting individual survival risks. Therefore, this study uses a large-scale population-based database to investigate the factors influencing the prognosis of cHCC-CCA patients and to construct a corresponding prognostic model.

Patients and methods

Patients

Patients diagnosed with cHCC-CCA were identified from the Surveillance, Epidemiology, and End Results (SEER) database. Cases were selected using the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3), with a primary site code of C22.0 (liver) and a histology/behavior code of 8180/3. The inclusion criteria were an age of 18–89 years and a diagnosis of cHCC-CCA between 2004 and 2019. The exclusion criteria were: (1) incomplete clinical information and (2) death within one month of diagnosis or loss to follow-up during the same period.

Statistical methods

All eligible cases were randomly assigned to a training cohort and a validation cohort in a 7:3 ratio. The training cohort was used for constructing the nomogram, developing the survival prediction model, and establishing a risk classification system. The validation cohort served to assess the model’s accuracy.
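The random 7:3 allocation can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R); the function name `split_cohort`, the seed value, and the use of integer patient identifiers are assumptions made for illustration only.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly split patient IDs into training and validation cohorts.

    The seed is a hypothetical value, fixed only so the split is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 420 eligible patients, as reported in the Results section
train_ids, validation_ids = split_cohort(range(420))
print(len(train_ids), len(validation_ids))  # 294 126
```

Fixing the seed makes the allocation reproducible, which matters because the model is fitted on one cohort and assessed on the other.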

Thirteen factors were extracted from the SEER database: age, gender, race, marital status, tumor size, grade, T stage, N stage, M stage, AJCC stage, surgery, radiation, and chemotherapy. First, univariate Cox regression analysis was used to assess the prognostic value of each factor and to screen variables associated with survival outcomes (inclusion criterion: p < 0.05). The retained variables were then examined using backward stepwise regression to eliminate non-significant variables, making the model more concise and reducing the risk of overfitting (removal criterion: p > 0.05). Next, to assess multicollinearity, we calculated the variance inflation factor (VIF) for the variables selected by stepwise regression. Multicollinearity refers to high linear correlation between predictor variables, which can affect the stability and interpretability of the model, and the VIF detects and quantifies it. A VIF above 4.0 was taken to indicate significant multicollinearity, so variables with a VIF greater than 4.0 were excluded from the final model to ensure model stability and reduce the risk of biased coefficient estimates. Finally, the selected independent variables were used to develop the prognostic model and risk classification system.
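For the VIF screening step, the quantity for predictor j is VIF_j = 1 / (1 − R_j²), where R_j² is obtained by regressing predictor j on the remaining predictors; equivalently, the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The sketch below uses that identity in plain Python. It assumes numerically coded predictors, and all variable names and data values are hypothetical; it is not the procedure or software used in the study.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """Return VIF_j = 1 / (1 - R_j^2) for each column (diagonal of the inverse correlation matrix)."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical numerically coded predictors (illustrative values only)
predictors = {
    "tumor_size": [1, 2, 3, 4, 5, 6],
    "t_stage": [2, 1, 4, 3, 6, 5],
    "age": [60, 55, 70, 52, 66, 58],
}
for name, value in zip(predictors, vif(list(predictors.values()))):
    flag = "exclude" if value > 4.0 else "keep"  # 4.0 is the cutoff stated in the text
    print(f"{name}: VIF = {value:.2f} ({flag})")
```

With uncorrelated predictors every VIF equals 1; values grow without bound as a predictor becomes a linear combination of the others.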

The nomogram was validated using the concordance index (C-index), the area under the receiver operating characteristic curve (AUC), and calibration curves. The C-index and AUC range from 0.5 to 1.0, where 0.5 indicates discrimination no better than chance and 1.0 indicates perfect discrimination. Calibration curves generated from 500 bootstrap resamples were used to evaluate the calibration of the nomogram. Decision curve analysis (DCA) was used to quantify the net benefit of the nomogram at different threshold probabilities. The net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were used to evaluate the utility of the new model. In all of these analyses, the nomogram was compared with tumor staging based on AJCC standards. Additionally, a risk classification system was developed from the nomogram total score of each patient in the training cohort, categorizing all patients into three prognostic groups: low-, medium-, and high-risk.
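To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It assumes that a higher score means higher predicted risk and that pairs with tied survival times are skipped; the function name and the four example patients are hypothetical and are not taken from the study data.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when patient i had the event strictly before
    the follow-up time of patient j. Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical patients: survival in months, event flag (1 = death), nomogram points
times = [5, 12, 20, 30]
events = [1, 1, 0, 1]
points = [250, 180, 150, 90]
print(harrell_c_index(times, events, points))  # 1.0
```

Here every patient who died earlier also had a higher score, so all five comparable pairs are concordant. A value of 0.5 would correspond to random ranking.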

Data retrieval was performed using SEER*Stat software (version 8.4.3), and data analysis was conducted using R software (version 4.4.0). All p-values were evaluated with two-sided statistical tests, and differences were deemed statistically significant at a p-value threshold of less than 0.05.

Results

Patient characteristics

A total of 420 eligible patients were identified from the SEER database between 2010 and 2019 and were randomly assigned in a 7:3 ratio to a training cohort (294 patients, 70%) and a validation cohort (126 patients, 30%). The baseline clinicopathological characteristics and treatments of all patients are summarized in Table 1. The median survival time for the overall cohort was 16 months (interquartile range [IQR] 5–51 months). During the follow-up period, 335 patients (79.8%) died, of whom 277 (82.7%) died of cHCC-CCA. The median age of all patients was 62 years (IQR 56–69); 71.4% were male and 72.6% were white. In terms of treatment, over half of the patients underwent surgery and nearly half (44.5%) received chemotherapy, whereas only a small proportion (8.6%) received radiotherapy.

Table 1 Clinical characteristics of patients in the SEER database.

Nomogram variable screening

Univariate analysis revealed that marital status, tumor size, grade, T stage, N stage, M stage, AJCC stage, surgery, radiation, and chemotherapy were associated with patient prognosis (Table 2). Factors with a p-value < 0.05 in the univariate analysis were subsequently examined using backward stepwise regression in the multivariate analysis (elimination criterion: p > 0.05). Ultimately, marital status, tumor size, grade, AJCC stage, surgery, and chemotherapy were identified as independent predictors of cancer-specific survival (CSS) and were included in the prediction model (Table 2). All variables included in the model had VIF values below 4.0, indicating no significant multicollinearity among the selected variables. The variable selection process is detailed in Fig. 1.

Table 2 Univariate and multivariate analyses of each factor’s ability in predicting CSS.
Fig. 1

Selection and inclusion of variables in the nomogram.

Nomogram construction and validation

The prediction model is presented as a nomogram (Fig. 2). In the nomogram, the points scale gives the score for each variable, and the total points scale gives the overall score for each patient. For example, an unmarried patient with cHCC-CCA whose tumor does not exceed 5 cm, whose pathological grade is I–II, whose AJCC stage is II, and who has undergone surgery but not chemotherapy has a total score of approximately 117 points. The corresponding CSS rates at 1, 3, and 5 years are approximately 82.0%, 57.0%, and 48.0%, respectively. The C-index of the nomogram was 0.777 and 0.771 in the training and validation cohorts, respectively. In the training cohort, the AUC values at 1, 3, and 5 years were 0.858, 0.867, and 0.895, respectively. In the validation cohort, the corresponding AUC values were 0.838, 0.881, and 0.870 (Fig. 3), indicating satisfactory discriminatory capability of the nomogram. In addition, we calculated the AUC values of the AJCC staging system and of the constructed nomogram at 10 different time points in the validation set and compared them with a paired t-test. The improvement in performance of the new model over the traditional AJCC staging system was statistically significant (t = −16.847, p < 0.05). The calibration curves showed good agreement between the observed CSS probabilities and those predicted by the nomogram at 1, 3, and 5 years (Fig. 4). The NRI at 1, 3, and 5 years was 0.392 (95% CI 0.311–0.523), 0.425 (95% CI 0.285–0.539), and 0.414 (95% CI 0.300–0.583), respectively. The IDI at 1, 3, and 5 years was 0.165 (95% CI 0.112–0.243, P < 0.001), 0.151 (95% CI 0.091–0.225, P < 0.001), and 0.151 (95% CI 0.087–0.229, P < 0.001), respectively. These results were confirmed in the validation cohort (Table 3), demonstrating that the nomogram predicted prognosis more accurately than tumor staging based on the AJCC criteria. Furthermore, the net benefit of the nomogram relative to the AJCC staging system was assessed. In both the training and validation cohorts, the DCA curves showed that, at nearly all threshold probabilities, the nomogram provided a greater net benefit for predicting 1-, 3-, and 5-year CSS than the AJCC staging system, the “treat-all-patients” strategy, and the “treat-no-patients” strategy. This indicates that our nomogram is a more beneficial tool for patient risk assessment (Fig. 5).
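The net benefit plotted in a decision curve can be written as NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability, and TP and FP are the numbers of true and false positives among n patients when everyone with predicted risk of at least pt is treated. The sketch below is a simplified illustration that assumes a binary outcome at a fixed time horizon with no censoring; decision curves for survival outcomes, as in this study, additionally require censoring-adjusted estimates. All names and data values are hypothetical.

```python
def net_benefit(predicted_risk, died, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    n = len(died)
    true_pos = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and d)
    false_pos = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and not d)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

def net_benefit_treat_all(died, threshold):
    """Net benefit of the treat-all strategy; treat-none always has net benefit 0."""
    prevalence = sum(died) / len(died)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks of death by a fixed horizon and observed outcomes
risks = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]
died = [1, 1, 0, 1, 0, 0]
model_nb = net_benefit(risks, died, threshold=0.5)
treat_all_nb = net_benefit_treat_all(died, threshold=0.5)
print(round(model_nb, 4), round(treat_all_nb, 4))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat-none); a decision curve repeats this comparison across a range of thresholds.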

Fig. 2

Nomogram predicting 1-, 3-, and 5-year cancer-specific survival in patients with combined hepatocellular-cholangiocarcinoma. CSS cancer-specific survival.

Fig. 3

The receiver operating characteristic curves of the nomogram predicting cancer-specific survival for both the training and validation cohorts are presented for different time periods: (a) 1 year; (b) 3 years; (c) 5 years.

Fig. 4

Calibration plots of the nomogram for 1-, 3-, and 5-year CSS prediction in the training cohort (a) and validation cohort (b). A perfect prediction corresponds to a slope of 1, represented by the diagonal 45-degree orange line. CSS cancer-specific survival.

Table 3 NRI and IDI to evaluate the predictive power of the model.
Fig. 5

Decision curve analysis of the nomogram and AJCC tumor staging for the cancer-specific survival prediction of combined hepatocellular-cholangiocarcinoma: (a–c) 1-, 3- and 5-year CSS benefits based on the training cohort. (d–f) 1-, 3- and 5-year CSS benefits based on the validation cohort.

Risk-stratified survival analysis

Based on the total scores generated by the nomogram, we developed a risk classification system for CSS that divides patients into a low-risk group (0 < score < 126), a medium-risk group (126 ≤ score < 223), and a high-risk group (223 ≤ score ≤ 400). In the overall cohort, training set, and validation set, the median survival times for the medium-risk group were 17, 18, and 16 months, respectively. In contrast, the median survival times for the high-risk group were only 5, 6, and 5 months. The Kaplan–Meier survival curves show significant differences among the three risk groups (Fig. 6). The figure also indicates that this risk classification system separates patients with different risk levels more clearly than the traditional AJCC staging system.
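The cutoffs above translate directly into a lookup from nomogram total points to risk group. The following minimal sketch uses the thresholds reported in the text (126 and 223, with scores between 0 and 400); the function name and the choice to raise an error for out-of-range scores are assumptions for illustration.

```python
def risk_group(total_points):
    """Map a nomogram total score to the low-, medium-, or high-risk group."""
    if not 0 < total_points <= 400:
        raise ValueError("total points must lie in the interval (0, 400]")
    if total_points < 126:
        return "low"
    if total_points < 223:
        return "medium"
    return "high"

# The worked example from the nomogram section scored approximately 117 points
print(risk_group(117))  # low
```

The boundary scores follow the reported intervals: a score of exactly 126 falls in the medium-risk group and a score of exactly 223 in the high-risk group.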

Fig. 6

Kaplan–Meier cancer-specific survival curves of patients with combined hepatocellular-cholangiocarcinoma. (a) In the training cohort at different risks stratified according to the nomogram. (b) In the validation cohort at different risks stratified according to the nomogram. (c) In the training cohort at different stages classified according to the AJCC criteria-based tumor staging. (d) In the validation cohort at different stages classified according to the AJCC criteria-based tumor staging.

Discussion

Although cHCC-CCA is a very rare tumor, its incidence and mortality have increased in recent years. Spolverato et al.24 reported that the number of diagnosed cHCC-CCA cases almost doubled from 2004 to 2015. However, clinical evidence regarding its prognosis remains limited. Therefore, we constructed a nomogram to predict the prognosis of cHCC-CCA patients. Validation indicated that the nomogram has good discrimination and calibration. Additionally, the predictive variables included in the model are readily available in clinical practice. According to a review study25, the average C-index of existing models is 0.75. In contrast, the C-index of the new model presented in this paper is 0.777 in the training cohort and 0.771 in the validation cohort, which is above this average. In addition to the nomogram, a risk classification system was developed that effectively categorized the entire cHCC-CCA cohort into three distinct prognostic groups, thereby complementing the nomogram’s utility. Based on the regression results, we selected six variables to incorporate into the nomogram. The standard deviation measurements in the nomogram indicate that whether surgery was performed is the most important prognostic factor, followed by tumor stage and tumor size.

Several studies indicate that unmarried patients experience more stress when diagnosed with malignant tumors, which may contribute to tumor progression. Married individuals benefit from better social support and health behaviors, resulting in longer overall survival. Consequently, marital status is considered an independent prognostic factor for survival in many cancers. From a social psychology perspective, patients without a spouse are more susceptible to depression and anxiety, which can reduce their willingness to undergo therapy and diminish their confidence in recovery after treatment26,27. Therefore, the prognosis for these patients is worse than that of married patients. Our study suggests that being unmarried is an independent risk factor affecting the prognosis of cHCC-CCA patients, which may be due to greater psychological stress stemming from awareness of the tumor’s rarity, ultimately leading to poorer outcomes.

An increasing number of studies have found that larger tumor size is associated with poorer survival in cHCC-CCA patients28. According to research by Tang et al., patients with a maximum tumor diameter of less than 5 cm have longer disease-free survival after surgery; meanwhile, Wang et al. found that larger tumor size (> 5 cm) is a significant risk factor for recurrence in cHCC-CCA patients one and two years after surgery28,29,30. Additionally, tumor size is one of the important staging factors in the eighth edition of the AJCC staging system, which uses it to categorize patients into T1a and T1b stages. Therefore, tumor size can serve as an independent prognostic predictor for cHCC-CCA.

Pathological grading is a critical factor influencing tumor treatment and recurrence. A higher degree of differentiation means that tumor cells more closely resemble the normal cells from which they originate, which correlates with lower malignancy and aggressiveness. In contrast, poorly differentiated or undifferentiated tumors deviate more from normal tissue and are associated with greater malignancy21. Our study found that the prognosis for patients with poorly differentiated or undifferentiated cHCC-CCA is worse than that for those with well- or moderately differentiated tumors, in line with previous findings31.

Due to the low incidence of cHCC-CCA and the lack of prospective studies, there is currently no clear treatment consensus. Therefore, the treatment of cHCC-CCA is typically inferred from the established therapies for HCC or ICC. Current research indicates that surgery and chemotherapy play an increasingly important role in extending the survival of patients with cHCC-CCA32. At the same time, our study also indicates that both surgery and chemotherapy are independent prognostic factors for cHCC-CCA. Surgical resection is currently the cornerstone and the best option for the treatment of cHCC-CCA. Research generally considers that assessing the resectability of tumors is the first step in determining the treatment for cHCC-CCA. For all patients with acceptable tumor factors and residual liver function, surgical resection should be attempted in the absence of obvious contraindications33,34. Aggressive surgical interventions, including liver resection and lymph node dissection, may improve the poor prognosis of cHCC-CCA patients and provide the longest overall survival35. However, only a few patients are eligible for surgical resection. Additionally, even after radical resection, the risk of recurrence remains high, with a median time to recurrence of 6 to 9 months36,37. For patients with tumors that cannot be surgically removed or have localized recurrence, local tumor treatment methods such as transcatheter arterial chemoembolization and radiofrequency ablation may help reduce tumor staging, create surgical opportunities, and provide survival benefits6.

There is currently some controversy regarding the survival benefits of liver transplantation (LT). A retrospective matched cohort study reported similar 5-year survival rates between cHCC-CCA patients who underwent LT and HCC controls (78% vs. 86%). Another retrospective study showed that the 5-year overall survival rate was higher for cHCC-CCA patients with concurrent cirrhosis who underwent LT compared to those who had liver resection. However, a multi-center study found no survival difference between LT and liver resection for cHCC-CCA patients who exceeded the Milan criteria38,39,40. Due to insufficient sample size, the current evidence is inadequate to recommend LT for the treatment of cHCC-CCA, and further multi-center prospective studies are necessary.

For unresectable cases, systemic therapy is the primary treatment option41. Although a standard treatment regimen for cHCC-CCA has not yet been established, various chemotherapeutic agents, including gemcitabine and platinum-based drugs, have been tested. Research by Trikalinos et al. found that gemcitabine-based chemotherapy regimens, particularly gemcitabine-platinum combinations, outperform sorafenib and other systemic treatment options in terms of progression-free survival42. Clinical trials of gemcitabine and cisplatin combined with atezolizumab, and of atezolizumab combined with bevacizumab, are also underway34. However, most of the data on the efficacy of systemic drugs come from single-center studies, and chemotherapy options require further investigation.

Traditionally, tumor staging based on the AJCC criteria has been the preferred method for predicting the prognosis of patients with malignant tumors. However, its performance in predicting individual survival risk is insufficient. We compared the nomogram we constructed, which includes demographic and clinicopathological characteristics, with the traditional AJCC staging system. The NRI, IDI, and DCA indicate that the nomogram outperforms tumor staging based solely on AJCC criteria in terms of predictive ability and clinical applicability. Using the nomogram, clinicians can quickly calculate the probability of mortality from individual patient characteristics (such as marital status, tumor staging, etc.) and clearly explain the prognosis to the patient and their family. This personalized application helps clinicians communicate more accurately with patients. Furthermore, we divided cHCC-CCA patients into three risk categories: low, medium, and high. Kaplan–Meier curves showed significant differences in CSS among these groups. For patients with higher predicted risks according to the nomogram, clinicians may need to adopt more aggressive treatments (such as surgery or chemotherapy) or closer follow-up plans. For patients with lower risks, a more conservative follow-up strategy can be considered to reduce unnecessary medical interventions and the burden on patients.

While our nomogram demonstrates satisfactory performance, this study has several limitations. First, many patients were excluded because of unknown or incomplete data. Second, the SEER database lacks some serological indicators (such as carbohydrate antigen 19-9, liver function levels, etc.), which may affect the comprehensive assessment of prognostic factors. Finally, although our nomogram was developed and internally validated using a large cohort, it still requires multicenter clinical validation to evaluate its external applicability.

Conclusions

We developed a new nomogram and corresponding risk classification system. Compared to traditional staging systems, our nomogram demonstrates superior predictive accuracy and clinical applicability, providing clinicians with a practical and user-friendly tool for assessing the risk of cHCC-CCA patients in daily practice. At the same time, by integrating individual patient characteristics, this model will help patients and their families better understand the prognosis, allowing clinicians to pay more attention to medium- and high-risk patients, thereby aiding clinical decision-making.