Introduction

Hepatocellular carcinoma (HCC) remains one of the most prevalent malignant tumors worldwide. Because early symptoms are often occult, the disease has usually progressed to the intermediate or advanced stage by the time of diagnosis, which precludes effective treatments and results in a poor prognosis. Consequently, systemic therapy constitutes a primary treatment strategy for advanced HCC1.

According to the Barcelona Clinic Liver Cancer (BCLC) staging system, patients in the advanced stage (BCLC-C) present with intrahepatic vascular invasion, extrahepatic metastasis, or mild tumor-related symptoms, as reflected by an Eastern Cooperative Oncology Group performance status (ECOG PS) score of 1–2 (ref. 2). Further subclassification revealed that patients with PS1 but without vascular invasion or extrahepatic metastasis, also defined as PS1 alone, survived longer; therefore, some studies recommended that these patients be classified as intermediate stage3,4,5. In the Hong Kong Liver Cancer (HKLC) staging system, patients with PS0 and PS1 are not distinguished during staging and are even assigned the same treatment recommendations6. Additionally, several studies still enrolled patients with PS1 alone in the “intermediate stage” because of the vague staging definition7,8,9,10. Taken together, the advanced stage of HCC comprises a highly heterogeneous patient population, and further patient stratification is needed.

Generally, transarterial chemoembolization (TACE) is recommended as the first-line treatment option for patients with intermediate-stage HCC2; however, in clinical practice it is often used beyond guideline recommendations, especially in advanced HCC patients with PS1 alone11,12,13,14. Furthermore, previous studies have shown that well-selected advanced HCC patients achieved survival outcomes of 28.0–39.8 months following TACE, better than the 6.4–19.2 months reported for general advanced HCC patients undergoing systemic therapy11,15,16,17,18,19,20,21. Consequently, indiscriminately classifying HCC patients with PS1 alone as advanced stage might deprive them of the opportunity for more aggressive treatment, including TACE. A risk stratification designed specifically for patients with PS1 alone is warranted to identify those who might benefit from TACE.

Based on the prognostic factors related to TACE, the Hepatoma Arterial-embolization Prognostic (HAP) scoring system was proposed to predict treatment outcomes of TACE and identify potential candidates22. To improve its predictive ability, a series of modified HAP scoring models have been developed and externally validated (Fig. 1)23,24,25,26,27,28. Using these models, this study aims to explore the heterogeneity of HCC patients who are classified as “advanced” solely because of PS1 rather than vascular invasion or metastasis, and to identify the candidates who could benefit from TACE treatment and be migrated to the intermediate stage.

Fig. 1 HAP series scoring system for risk stratification.

Results

Patient characteristics

Among the 745 eligible HCC patients included (Fig. 2), 645 were male (86.6%). Hepatitis B virus (HBV) infection was the most common etiology, present in 663 patients (89.0%). Among them, 477 patients were receiving antiviral treatment at the time of treatment, 151 were not, and data were missing for the remaining 35. In addition, 616 (82.7%) patients had a Child–Pugh score of 5, and 436 (58.5%) had a single tumor. Regarding tumor-related symptoms evaluated by ECOG, 450 (60.4%) patients had PS0 and 295 (39.6%) had PS1 (Table 1). Moreover, patients with PS1 had lower levels of albumin (ALB), higher levels of aspartate aminotransferase (AST) and platelets (PLT), and a larger tumor burden (tumor number and size) than those with PS0 (all P < 0.05).

Fig. 2 Flowchart of the patient selection process. HCC, hepatocellular carcinoma; TACE, transarterial chemoembolization; PS, performance status.

Table 1 Baseline characteristics and chemotherapeutic regimen of TACE for the study cohort.

Prognostic ability of PS score

During a median follow-up of 18.4 months, the median overall survival (OS) was 24.3 months (95% CI 21.5–27.1) in the whole cohort. Patients with PS0 had significantly longer OS (29.8 months, 95% CI 27.2–32.4) than patients with PS1 alone (16.4 months, 95% CI 12.9–19.9), with 1-, 2-, and 3-year survival rates of 77.5%, 41.6%, and 19.6% in the PS0 group vs. 61.5%, 29.3%, and 18.3% in the PS1 group (log-rank P < 0.001) (Fig. 3A). Univariate analysis demonstrated that alpha-fetoprotein (AFP), tumor number, tumor size, ALB, total bilirubin (TBIL), AST and PS were prognostic factors (all P < 0.05) (Table 2). When these variables were entered into a multivariate analysis, AFP (adjusted HR = 1.46, P < 0.001), tumor number (adjusted HR = 1.41, P < 0.001), tumor size (adjusted HR = 1.07, P < 0.001), ALB (adjusted HR = 0.98, P = 0.021), TBIL (adjusted HR = 1.00, P = 0.045), and PS (adjusted HR = 0.81, P = 0.035) were significantly associated with OS. To account for the confounding effects of these prognostic factors, the same variables were used in propensity score matching (PSM) to evaluate the predictive ability of the PS score; 1:1 matching yielded 295 pairs of patients with balanced propensity scores in the PS0 and PS1 groups. In the matched cohort, survival analysis again showed better OS in patients with PS0 than in those with PS1 alone (1-, 2-, and 3-year survival rates of 75.5%, 36.9%, and 12.9% in the PS0 group vs. 61.5%, 29.3%, and 18.3% in the PS1 group; log-rank P = 0.010) (Fig. 3B).

Fig. 3 Survival comparison for patients with PS0 and PS1 for the whole cohort (A) and PSM cohort (B). PS, performance status; PSM, propensity score matching.

Table 2 Cox regression analysis for OS in the whole cohort.

Validation of the HAP models in risk stratification in this cohort

The HAP score and its variants were established to predict outcomes and guide treatment selection in patients with HCC undergoing TACE, with higher scores associated with worse outcomes22,23,24,25,26,27. According to these models, the patients in this cohort were classified into four risk groups (grades A, B, C and D). In the mHAP-III model, the median and interquartile scores (0.18, 0.28 and 0.78) were used as cut-off values for risk stratification (Table 3). Survival analyses showed that the median OS decreased progressively as the grade escalated from A to D in all four scoring systems (log-rank P < 0.001). Using grade A as the reference, the risk of death for patients in grades B, C and D was significantly increased according to the three modified HAP scoring models (all P < 0.05). Although OS did not differ between patients in HAP grades A and B (P = 0.154), those in grades C and D had worse survival than patients in grade A (both P < 0.001). Therefore, this series of HAP scoring models was well validated in patients treated with TACE in this dataset.
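The cut-off-based grading described for mHAP-III can be illustrated with a minimal sketch. The function name `risk_grade` and the handling of scores that fall exactly on a cut-off are assumptions made for illustration; the text does not state how ties were assigned, and the computation of the mHAP-III score itself is not shown here.

```python
from bisect import bisect_left

# Cut-off values reported in the text for the mHAP-III model.
MHAP3_CUTOFFS = (0.18, 0.28, 0.78)
GRADES = ("A", "B", "C", "D")


def risk_grade(score, cutoffs=MHAP3_CUTOFFS):
    """Map a continuous risk score to grade A, B, C or D.

    Assumption: a score exactly equal to a cut-off is assigned to the
    lower-risk grade; the source does not specify tie handling.
    """
    # bisect_left counts how many cut-offs lie strictly below the score.
    return GRADES[bisect_left(cutoffs, score)]


if __name__ == "__main__":
    for s in (0.10, 0.20, 0.50, 0.90):
        print(s, risk_grade(s))
```

Higher scores map to higher-risk grades, consistent with the direction of risk described above.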

Table 3 Performance of the HAP model and its variants in predicting OS.

Prognostic values of PS among different risk stratifications

Considering that PS1 was correlated with higher tumor burden and worse liver function, we hypothesized that PS might not be an independent prognostic factor for patients in the low-risk strata. The prognostic ability of PS was evaluated separately under different definitions of low risk, by cascading merger of grade A to grade A + B, and then to grade A + B + C (Supplementary Table 1). Interestingly, PS had no significant prognostic value in the low-risk strata, but its prognostic value gradually emerged as the patient group was enlarged, in all models. Moreover, multivariate analyses showed that PS lost its prognostic value for OS in grade A of HAP (adjusted HR = 0.67, P = 0.176), mHAP (adjusted HR = 0.79, P = 0.286), mHAP-II (adjusted HR = 0.95, P = 0.918) and mHAP-III (adjusted HR = 0.83, P = 0.831), as well as in grade A + B of HAP (adjusted HR = 0.92, P = 0.585), mHAP (adjusted HR = 0.85, P = 0.202), mHAP-II (adjusted HR = 0.88, P = 0.483) and mHAP-III (adjusted HR = 0.79, P = 0.129). However, when the patient group was enlarged to grade A + B + C, PS regained its prognostic value in the scoring systems of HAP (adjusted HR = 0.76, P = 0.010), mHAP (adjusted HR = 0.78, P = 0.012), mHAP-II (adjusted HR = 0.75, P = 0.038) and mHAP-III (adjusted HR = 0.75, P = 0.013) (Table 4). In addition, these findings were confirmed after propensity score adjustment in the different risk stratifications (Table 4).

Table 4 Multivariate analyses for PS in predicting OS in different definitions of candidates.

Survival analysis of the identified candidates for TACE

As PS lost its prognostic value in low-risk patients with grade A or B in all HAP scoring models, patients with PS1 alone in these grades were able to achieve OS comparable to that of their counterparts with PS0. Thus, patients with grade A or B were defined as potential candidates for TACE who should be migrated from BCLC stage C to B, whereas patients with grade C or D in all HAP-based models were defined as non-candidates for TACE. Among the candidates, Kaplan–Meier curves demonstrated that patients with PS1 alone survived better than those with PS0 in most HAP-based scoring models (log-rank P < 0.05 in the HAP, mHAP and mHAP-III models, and log-rank P = 0.095 in the mHAP-II model; Fig. 4A, C, E and G). However, after PSM adjustment, there was no significant difference in OS between potential candidates with PS1 alone and those with PS0 in any of these HAP-based models (all log-rank P > 0.05; Fig. 4B, D, F and H).

Fig. 4 Prognostic ability of PS in all potential candidates and in the PSM cohort according to the scoring systems of HAP (A and B), mHAP (C and D), mHAP-II (E and F) and mHAP-III (G and H). PS, performance status; PSM, propensity score matching.

Discussion

In this study, we observed that patients with PS1 had a higher tumor burden and higher AST levels, but lower ALB levels, than those with PS0, in accordance with prior reports indicating that PS is associated with tumoral and cirrhotic factors in HCC patients29. Moreover, survival analyses herein demonstrated that PS1 was an independent negative predictor of OS in the whole cohort. These findings seemed to support previous studies and the BCLC staging system, which assign patients with PS1 alone to the advanced stage2,19,21. However, accumulating data suggest that there is substantial heterogeneity among patients with BCLC-C stage and that the prognostic ability of PS may be influenced by patients' baseline characteristics. In the HKLC staging system, patients with PS0 and PS1 are assigned to the same stage and to similar therapy6. Other studies further advocated that HCC patients with PS1 alone should receive more aggressive therapy, instead of strictly following the guidelines in clinical practice7,9,14,30,31. Herein, with the aid of a series of HAP scoring models, we found that the prognostic value of PS varied across risk grades in patients with PS1, further supporting the heterogeneity of this population. Thus, it is necessary to stratify HCC patients with PS1 alone to guide treatment selection.

Several studies have confirmed that a proportion of patients with advanced HCC could benefit from TACE, but patients with PS1 alone have received less attention3,13,20,32. In this study, we focused on this population and risk-stratified them using HAP scoring models in an attempt to identify those who would benefit from TACE. Notably, the significance of PS in predicting clinical outcomes disappeared in low-risk patients (grades A and B) in all models. Further survival analysis showed that these patients achieved OS comparable to that of patients with PS0. We believe that these low-risk patients with PS1 alone and favorable baseline characteristics should be migrated to BCLC-B stage and defined as “candidates” for TACE. In contrast, PS1 remained a prognostic factor in high-risk HCC patients (grades C and D) with poor baseline characteristics, who were defined as “non-candidates” for TACE and should stay in BCLC-C stage.

As mentioned before, TACE has been frequently used to treat advanced HCC patients, especially those with liver-confined HCC and PS1 alone11,21,32. In addition, TACE has been proven to be superior to sorafenib in advanced patients with PS1 alone and a single tumor12. In this study, the median OS for patients deemed candidates ranged from 21.7 to 47.2 months across the HAP models, which was better than that of patients receiving systemic therapy15,16,17 and roughly equal to that of patients undergoing TACE in previous reports19,21. Hence, inappropriately assigning these candidates to the advanced stage might cost them the opportunity for aggressive therapy, including TACE, and for favorable outcomes. It should also be noted that these HAP scoring models performed differently in identifying candidates for TACE. The mHAP model outperformed the others in selecting the largest number of suitable candidates for TACE (174, 23.4%).

This study had several limitations. First, owing to the retrospective design, the evaluation of PS based on clinical records inevitably introduced subjectivity and information bias4. To limit these biases, PS was assessed by three independent experienced clinicians; disagreements were resolved by careful discussion and prompt follow-up, especially for patients with no or mild tumor-related symptoms. Second, because the majority of patients in our study were Chinese with HBV infection, the findings should be generalized to non-Chinese populations and to patients with other etiologies with caution; the inclusion of HBV-related factors, such as HBV DNA levels, may further improve the precision and validity of the statistical analysis. Finally, external validation of the current findings is needed. Given that the HAP score is not routinely used in pre-TACE evaluations, it would be more clinically relevant to identify the particular factors or patient subgroups with PS1 that may still derive benefit from TACE.

Conclusion

In summary, PS1 was a significant prognostic factor for patients with liver-confined HCC, but its predictive ability varied with baseline characteristics. Risk stratification based on HAP scoring models could identify patients in grade A or B, who could be migrated to BCLC stage B and benefit from TACE treatment.

Materials and methods

Patient eligibility

A total of 1010 consecutive HCC patients undergoing TACE from January 2010 to October 2021 were retrospectively screened in fifteen Chinese tertiary hospitals. While TACE is a well-established treatment for intermediate-stage HCC, its off-label use occurs in certain clinical scenarios33. In real-world clinical practice in these centers, TACE was also used in selected early-stage HCC patients, serving as a bridging therapy for liver transplant candidates. For selected advanced-stage HCC patients without distant metastasis but with local vascular invasion, preserved liver function, and mild symptoms (PS1), TACE was performed as a palliative treatment to alleviate symptoms and prolong survival. Some patients with recurrent HCC or ineligible for surgery or ablation also received TACE. Generally, patients were considered unsuitable for TACE if they exhibited poor liver function, complete main portal vein thrombosis, distant metastasis, severe comorbidities, or poor general condition (PS ≥ 2). Notably, patients with a history of decompensation who were compensated at the time of TACE were not automatically excluded. Their eligibility was assessed based on their current clinical status and risk factors, with careful consideration of their medical history.

In this study, patients were included if they met the following criteria: (I) were diagnosed with HCC according to BCLC guidelines; (II) received TACE as the initial treatment; (III) had complete baseline and follow-up information. The exclusion criteria were as follows: (I) previous treatments; (II) presence of vascular invasion or extrahepatic spread; (III) liver function beyond Child–Pugh class A; (IV) ECOG PS score ≥ 2; (V) decompensation events of cirrhosis at the time of TACE, including ascites, jaundice, hepatic encephalopathy, infections or portal-hypertensive hemorrhages. Finally, 745 patients were included in the whole cohort (Fig. 2).
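The inclusion and exclusion criteria above can be expressed as a single eligibility check. The sketch below is illustrative only: the `Patient` record and all of its field names are hypothetical and do not correspond to the actual study database.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All field names are hypothetical; they mirror the criteria in the text.
    hcc_diagnosed: bool        # inclusion (I)
    tace_initial: bool         # inclusion (II)
    complete_data: bool        # inclusion (III)
    previous_treatment: bool   # exclusion (I)
    vascular_invasion: bool    # exclusion (II)
    extrahepatic_spread: bool  # exclusion (II)
    child_pugh_class: str      # "A", "B" or "C"; exclusion (III) if not "A"
    ecog_ps: int               # 0-4; exclusion (IV) if >= 2
    decompensated: bool        # exclusion (V), at the time of TACE


def is_eligible(p: Patient) -> bool:
    """Return True if all inclusion criteria hold and no exclusion applies."""
    included = p.hcc_diagnosed and p.tace_initial and p.complete_data
    excluded = (
        p.previous_treatment
        or p.vascular_invasion
        or p.extrahepatic_spread
        or p.child_pugh_class != "A"
        or p.ecog_ps >= 2
        or p.decompensated
    )
    return included and not excluded
```

Under this check, a patient with PS1, Child–Pugh class A and no vascular invasion or extrahepatic spread remains eligible, which corresponds to the “PS1 alone” population studied here.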

Clinical, laboratory and imaging data were evaluated and collected by three independent clinicians from the database of each enrolled hospital. The research protocol complied with the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the ethics committee of Air Force Hospital of Western Theater Command. Informed consent was obtained from all the patients.

Diagnosis and definitions

HCC was diagnosed by contrast-enhanced magnetic resonance imaging (MRI) and computed tomography (CT) according to the guidelines of the American Association for the Study of Liver Diseases or the European Association for the Study of Liver Diseases34,35.

Performance status was determined by using the ECOG PS scale of 0 (asymptomatic) to 4 (restricted to bed). A score of 0 indicates that the patient is fully active and able to carry out all pre-disease activities without any restriction. A score of 1 signifies that the patient is restricted in physically strenuous activities but remains ambulatory and capable of performing light or sedentary work. A score of 2 means the patient is ambulatory and capable of self-care but unable to perform work activities; they are up and about more than 50% of waking hours. A score of 3 indicates that the patient is capable of only limited self-care and is confined to a bed or chair for more than 50% of waking hours. A score of 4 describes a patient who is completely disabled, unable to carry out any self-care, and totally confined to a bed or chair36.

The HAP scoring system and its variants were established to predict survival in patients with HCC undergoing TACE. The original HAP score incorporates albumin, bilirubin, and AFP to stratify patients into risk groups and guide treatment decisions22. The modified HAP I–III scoring systems refine the original score by incorporating additional prognostic factors, including tumor size, tumor number and portal vein invasion23,24,25. The specific scoring criteria are shown in Fig. 1.

Treatment and follow-up

During the TACE procedure, an emulsion of lipiodol (3–30 ml) mixed with chemotherapeutic drugs was injected. Doxorubicin (10–50 mg), cisplatin (10–110 mg), epirubicin (10–50 mg), or oxaliplatin (100–200 mg) was selected according to the practice of each center. Afterwards, either gelatin sponge or polyvinyl alcohol foam particles were introduced, and the embolization was monitored until the tumor arterial flow was reduced as observed on angiography. Tumor-feeding vessels were embolized either selectively or super-selectively, and “on demand” TACE procedures were scheduled at intervals of 6 to 12 weeks upon the demonstration of viable tumors or intrahepatic recurrences by CT or MRI in patients with favorable clinical and laboratory findings (performance status, liver function, etc.) and without extrahepatic spread or vascular invasion. All procedures were performed by investigators with at least eight years of TACE experience. OS was defined as the time from the date of TACE until death or the date of the last follow-up, and the last TACE procedure took place in October 2021.
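The OS definition above, together with the Kaplan–Meier estimation used in the survival analyses, can be sketched as follows. This is a minimal illustration rather than the analysis code of the study (which was run in SPSS); the function names and the conversion of days to months using an average month length of 30.4375 days are assumptions.

```python
from datetime import date


def os_months(tace_date, death_date, last_follow_up):
    """Return (OS in months, event flag) for one patient.

    event = 1 if death was observed, 0 if censored at last follow-up.
    Assumption: days are converted to months with 30.4375 days per month.
    """
    end = death_date if death_date is not None else last_follow_up
    return (end - tace_date).days / 30.4375, int(death_date is not None)


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    print(os_months(date(2020, 1, 1), None, date(2020, 1, 31)))
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the number at risk up to their last follow-up but do not lower the survival estimate themselves.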

Statistics

Categorical variables were described by frequencies and percentages, and continuous data were described by median and interquartile range. OS was estimated using Kaplan–Meier curves and compared by log-rank test. Cox proportional hazards regression models were used to analyze prognostic factors associated with outcomes, where PS (PS0 vs. PS1) was used as a stratified covariate. The series of HAP scoring models (Fig. 1) were used separately for risk stratification, and the prognostic value of PS was calculated in each classification. Considering the confounding effects of baseline characteristics on outcomes, the predictive ability of PS was adjusted for these differences using multivariate Cox regression models and propensity score matching. First, each baseline variable was examined in a univariate Cox regression model and then adjusted in a multivariate model to find independent predictors of OS. Second, we performed 1:1 nearest-neighbor PSM with a caliper value of 0.2 to remove possible confounders and variables associated with survival. Statistical analyses were performed using SPSS software version 25.0 (SPSS Inc., Chicago, IL, USA).
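The 1:1 nearest-neighbor matching step can be illustrated with a short sketch. It is not the SPSS procedure used in the study: it assumes the propensity scores have already been estimated (for example by logistic regression on the prognostic factors), uses a simple greedy strategy without replacement, and applies the caliper of 0.2 to the absolute difference in propensity score, since the text does not specify the scale of the caliper. The function name `match_1to1` is hypothetical.

```python
def match_1to1(ps_treated, ps_control, caliper=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    ps_treated, ps_control: propensity scores of the two groups
    (e.g. PS1 and PS0 patients). Returns (treated_index, control_index)
    pairs whose score difference is within the caliper.
    Assumption: the caliper is applied on the raw propensity-score scale.
    """
    available = set(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # Closest unmatched control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(ps_control[k] - p), k))
        if abs(ps_control[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


if __name__ == "__main__":
    treated = [0.30, 0.80, 0.55]
    control = [0.32, 0.10, 0.50, 0.95]
    print(match_1to1(treated, control))
```

Patients without a control inside the caliper remain unmatched and are dropped from the matched cohort; a tighter caliper improves balance at the cost of fewer pairs.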