Introduction

Hepatocellular carcinoma (HCC) ranks as the sixth most prevalent cancer worldwide and the fourth most common cause of cancer-related mortality. The disease is one of the most aggressive human malignancies1,2,3. In Egypt, HCC ranks as the fourth most common cancer and the leading cause of cancer-related death4. The incidence of diagnosed HCC cases has doubled within the past decade5. While enhanced screening efforts have contributed to this increase, the predominant factor driving this trend is the endemic prevalence of hepatitis C virus (HCV) infection, the major risk factor for HCC development in Egypt6.

Advancements in treatment have been associated with improved survival rates for cirrhotic patients, who have a higher risk of developing HCC4. Egypt’s pioneering national HCV screening and treatment campaign, initiated in 2018, represents a global benchmark in combating the disease. Through the administration of direct-acting antiviral therapy to over 2 million individuals, the program has been instrumental in identifying numerous HCC cases during follow-up surveillance using ultrasound and alpha-fetoprotein (AFP) testing6,7. AFP is a well-established biomarker for HCC screening in patients with chronic hepatitis and is a key diagnostic criterion when levels surpass 400 ng/mL, excluding pregnancy-related elevations. Approximately two-thirds of HCC patients exhibit elevated AFP levels8,9.

The low survival rate of HCC can be attributed to two key factors. First, the disease often presents asymptomatically in its early stages, making early diagnosis challenging. Second, effective treatment options are limited when the cancer is diagnosed at later stages, particularly after metastasis10. This aggressive progression is driven by the accumulation of genetic and epigenetic alterations, ultimately leading to cancer development and metastasis11.

Treatment options for HCC are highly dependent on tumor staging and liver function, as structured by the updated Barcelona Clinic Liver Cancer classification system. Early-stage HCC is typically treated with curative options such as surgical resection, ablation, or transplantation. Advanced disease warrants systemic therapies. While sorafenib historically constituted the primary treatment, the current first-line standard includes anti-PD-L1 combination therapies, with either anti-VEGF or anti-CTLA-4 agents12.

Long noncoding RNAs (lncRNAs) are a class of non-coding RNAs greater than 200 nucleotides in length. These molecules are key regulators in several physiological and pathological processes. LncRNAs show differential expression patterns across diverse cancers, affecting their growth and survival potential13. During normal growth and development, lncRNAs play essential roles in modulating immune responses and regeneration, maintaining the liver microenvironment. However, the persistent proliferative signals caused by dysregulated lncRNAs often lead to liver tumorigenesis. Aberrant transcriptional or processing events may result in the upregulation of oncogenic lncRNAs or the silencing of tumor-suppressing lncRNAs, leading to conditions such as chronic hepatitis, liver overgrowth, and oxidative stress, which in turn drive the initiation and progression of HCC14.

In HCC, numerous lncRNAs have been studied and found to promote many cancer hallmarks, such as proliferation, invasion, angiogenesis, and migration, while inhibiting cellular apoptosis15. These functions are mediated through mechanisms such as binding to DNA, RNA, or proteins, inducing epigenetic modifications, encoding small peptides, or acting as miRNA sponges that affect miRNA activity14. Recently, many studies have investigated the role of several lncRNAs in HCC progression. For instance, LINC00152 was found to promote cell proliferation through CCND1 regulation16. UCA1 was found to have a similar effect on the proliferation and apoptosis of HCC cells; however, the exact mechanism has not yet been fully elucidated17. HOTAIR was found to be associated with poor overall survival and disease-free survival in HCC patients. Other lncRNAs, such as H19 and MALAT1, have also been linked to HCC progression and poor prognosis. MALAT1 has been found to promote aggressive tumor phenotypes and facilitate progression18. On the other hand, some lncRNAs inhibit cancer cell proliferation and activate apoptosis, such as GAS5, which acts by triggering the CHOP and caspase-9 signaling pathways19.

HCC-associated lncRNAs are detectable in body fluids, which makes them accessible for analysis and highlights their potential as biomarkers for liquid biopsy in HCC. Emerging studies indicate that the expression levels of specific lncRNAs in the bloodstream offer promise as non-invasive biomarkers for the early detection and management of HCC15. Previous studies have identified several lncRNAs, including ENSG00000258332.1, LINC00635, SNHG1, LINC00152, LINC00853, HULC, and UCA1, as potential diagnostic markers20,21,22,23. For example, serum lncRNA-WRAP53 has been identified as an independent prognostic marker capable of predicting a high relapse rate in HCC patients24. Another study used lncRNA-WRAP53 in combination with UCA1 and AFP to improve predictive power20,21,22. Similarly, LINC00152 has been reported as a potential biomarker for HCC diagnosis, with better diagnostic power when combined with AFP or with both AFP and HULC21,22. These studies indicate that such lncRNAs are promising candidates as early diagnostic biomarkers, enabling timely intervention and potentially enhancing patient outcomes, especially when a combination of multiple lncRNAs is used alongside well-defined HCC biomarkers such as AFP25.

This study aimed to evaluate the diagnostic and prognostic utility of four lncRNAs (UCA1, GAS5, LINC00152, and LINC00853), selected based on previous literature20,21,22,23,26, and to use them as a combined diagnostic panel integrated with conventional liver function biomarkers. We also developed a machine learning (ML) model for accurate diagnosis of HCC using a combination of laboratory data, including plasma levels of the selected lncRNAs and standard laboratory liver function tests.

Patients and methods

Study population and sample collection

Fifty-two newly diagnosed adult patients with HCC were recruited from the Medical Oncology Department of Shefaa Al Orman Oncology Hospital, Egypt. Thirty age-matched healthy controls were also included in the study. The sample size was calculated targeting a power of 80% and a confidence level of 95%, using the means and standard deviations of the studied lncRNA expression levels reported in previous studies. Plasma samples were obtained from both groups: for HCC patients, samples were retrieved from the Shefaa Al Orman Biobank (SOH-BB), while control samples were collected following standard protocols. All participants provided written informed consent for study participation. The study protocol was approved by the ethical committee of Shefaa Al Orman under reference number SOH-IRB 09/2023.

Eligible patients were adults aged 18 years or older diagnosed with HCC according to the LI-RADS imaging criteria or histopathological examination of a tissue biopsy. All patients were treatment-naive before sample collection. The control group consisted of age- and gender-matched healthy individuals without a history of liver disease, cancer, or chronic inflammatory disorders. These individuals were selected from the pool of blood donors at Shefaa Al Orman Hospital.

Exclusion criteria included treatment with immunosuppressive drugs, a history of chronic inflammatory diseases, non-HCC liver tumors, or other past or concurrent malignancies. Additionally, patients were excluded if their medical records were incomplete or the available samples were insufficient. Patients younger than 18 years and those with conditions such as hereditary hemorrhagic telangiectasia, Budd-Chiari syndrome, or cirrhosis due to congenital hepatic fibrosis were also excluded to avoid false-positive results.

Clinical and laboratory data

Clinical and laboratory data were collected from the medical records of all HCC patients. This included measurements of serum levels for alanine aminotransferase (ALT), aspartate aminotransferase (AST), AFP, total bilirubin, and albumin. These same laboratory tests were also performed on the control plasma samples.

RNA isolation and cDNA synthesis

Total RNA was isolated from samples using the miRNeasy Mini Kit (QIAGEN, cat no. 217004) according to the manufacturer’s protocol. Reverse transcription into complementary DNA (cDNA) was carried out using the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, cat no. K1622). The reverse transcription reaction was performed on a T100 thermal cycler (Bio-Rad).

Quantitative real-time PCR (qRT-PCR)

To quantify the relative expression levels of the four lncRNAs, qRT-PCR was employed. The PowerTrack SYBR Green Master Mix kit (Applied Biosystems, cat no. A46012) and a ViiA 7 real-time PCR system (Applied Biosystems, Foster City, CA, USA) were used for this purpose. Primer sequences for qRT-PCR, designed by Thermo Fisher Scientific, are provided in Table 1. The housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used for normalization of expression data. Each qRT-PCR reaction was performed in triplicate. Relative quantification was performed using the ΔΔCT method27.
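The ΔΔCT calculation can be illustrated with a minimal Python sketch. It assumes the common 2^(−ΔΔCT) form, GAPDH as the reference gene, and the mean ΔCT of the control group as calibrator; the function name, the choice of calibrator, and the Ct values are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, calibrator_delta_ct):
    """Return the 2^(-ddCt) fold change for one sample.

    target_ct / reference_ct: triplicate Ct values for the lncRNA and GAPDH.
    calibrator_delta_ct: mean dCt of the control group (assumed calibrator).
    """
    delta_ct = mean(target_ct) - mean(reference_ct)    # normalize to GAPDH
    delta_delta_ct = delta_ct - calibrator_delta_ct    # compare with calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values, for illustration only.
fold_change = relative_expression(
    target_ct=[24.0, 24.0, 24.0],
    reference_ct=[18.0, 18.0, 18.0],
    calibrator_delta_ct=8.0,
)
print(fold_change)  # 4.0
```

In this example the sample ΔCT is 6 and the calibrator ΔCT is 8, so ΔΔCT = −2 and the lncRNA is four-fold more abundant than in the calibrator.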

Table 1 Sequences of the primers used for qRT-PCR.

Statistical analysis

Statistical analysis was performed using Minitab 17.1.0.0 for Windows (Minitab Inc., 2013). Data normality was assessed with the Shapiro-Wilk test. Continuous variables were presented as medians and interquartile ranges (IQR), while categorical variables were expressed as frequencies and percentages. Non-parametric data comparisons between patients and controls were performed using the Mann-Whitney U test for numerical data and Chi-square tests for categorical data. Receiver operating characteristic (ROC) curves were generated to evaluate the diagnostic potential of lncRNAs, AFP, ALT, and AST for HCC. A general linear model with stepwise forward selection was used to identify factors influencing lncRNA levels in HCC patients. Multiple logistic regression analysis, adjusted for age and sex, was employed to assess the role of lncRNAs in predicting HCC mortality. All statistical tests were two-sided, and a p-value of less than 0.05 was considered statistically significant.
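The link between the Mann-Whitney U test and the ROC analysis can be sketched as follows. The analysis in this study was run in Minitab; the Python code below is only an illustrative equivalent, and the function names and expression values are hypothetical. It shows that the AUC equals the U statistic divided by the number of case-control pairs.

```python
def mann_whitney_u(cases, controls):
    """U statistic: number of (case, control) pairs where the case is higher.

    Ties contribute 0.5 to the count.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc_from_u(cases, controls):
    """Area under the ROC curve, computed as U / (n_cases * n_controls)."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


# Hypothetical relative expression values, for illustration only.
cases = [3.1, 2.4, 5.0, 1.2]
controls = [1.0, 2.0, 1.5]
u = mann_whitney_u(cases, controls)
auc = auc_from_u(cases, controls)
print(u, round(auc, 3))  # 10.0 0.833
```

An AUC of 0.5 corresponds to no discrimination, and 1.0 to every patient value exceeding every control value.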

Machine learning model development

Machine learning models for the diagnosis of HCC, using a combination of lncRNA levels and other laboratory data, were implemented using the Python library Scikit-learn. The models were built with different classification algorithms (Gaussian Naïve Bayes, Gradient Boosting, support vector machine (SVM), and logistic regression) to compare their predictive performance and select the best-performing model. Our model choices were based on the following reasoning. SVM is well suited to high-dimensional, nonlinearly separable datasets. In addition, SVM’s ability to maximize the margin between classes (especially binary classes) makes it robust to overfitting, especially with smaller datasets. Naïve Bayes provides a probabilistic framework that can be useful for medical diagnosis, where probabilistic outputs can aid clinical decision-making. Gradient Boosting is an ensemble method known for its ability to model complex, non-linear relationships by combining weak learners (usually decision trees). It performs well in cases where feature interactions are unknown or difficult to model explicitly. Finally, logistic regression is a simple linear model that provides interpretable results.
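A minimal sketch of how the four classifiers might be set up in Scikit-learn is given below. The data are synthetic stand-ins generated with `make_classification`, and the hyperparameters, the RBF kernel, and the scaling step are assumptions, since they are not reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the study data: 82 samples (52 HCC + 30 controls)
# and 9 features (4 lncRNAs + 5 laboratory tests). These are not real data.
X, y = make_classification(
    n_samples=82, n_features=9, n_informative=5, n_redundant=0,
    weights=[30 / 82], random_state=0,
)

# Hyperparameters are library defaults or assumptions, not reported values.
models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Fit each model and collect its predictions on the same synthetic data.
fitted = {name: model.fit(X, y) for name, model in models.items()}
predictions = {name: model.predict(X) for name, model in fitted.items()}
```

Scaling is placed inside the SVM and logistic regression pipelines because both are sensitive to feature scale, whereas tree-based boosting and Gaussian Naïve Bayes are not.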

These machine learning models were tested using different combinations of laboratory data. The steps to implement a machine learning model include data selection, data cleaning and normalization, data transformation, splitting the data into a training set and a validation set, and finally model training using a classification algorithm, with evaluation using different metrics. We used cross-validation, with 100 different splits of the data set, in training and validation to obtain robust predictive models.

To enhance the robustness of our predictive models and ensure accurate performance evaluation, we adopted the RepeatedStratifiedKFold cross-validation technique. Specifically, we used 5 folds (k = 5) and repeated the process 5 times. This method preserves the class distribution in each fold and repeats the cross-validation process multiple times, each time with a different random split. By employing RepeatedStratifiedKFold with 5 folds repeated 5 times, we reduced the variance of the model evaluation metrics, providing a more stable estimate of model performance and preventing overfitting to a specific train-test split. This approach allowed each model to be trained and validated on 25 different data splits (5 folds × 5 repetitions), ensuring that the performance estimate was less likely to be overly optimistic or pessimistic, which can happen if only a single cross-validation run is performed.
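The 5 × 5 repeated stratified cross-validation described above can be sketched as follows. The data are synthetic, and the choice of logistic regression, the scaling step, and the random seeds are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 82-sample, 9-feature data set (not real data).
X, y = make_classification(
    n_samples=82, n_features=9, n_informative=5, n_redundant=0, random_state=0
)

# 5 folds repeated 5 times gives 25 train/validation splits.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)

# Scaling lives inside the pipeline so it is refit on each training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_validate(model, X, y, cv=cv, scoring=["recall", "precision"])

n_splits = cv.get_n_splits()
mean_recall = scores["test_recall"].mean()
mean_precision = scores["test_precision"].mean()
print(n_splits, round(mean_recall, 3), round(mean_precision, 3))
```

Keeping preprocessing inside the pipeline prevents information from the validation fold leaking into the training step, which would otherwise inflate the reported metrics.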

Results

Clinical and demographic characteristics of participants

The study population consisted of adult patients with a median age of 63 years (IQR: 58–68 years). Males comprised most participants (81.01%). Other clinical features, data about staging, and treatment regimens are presented in Table 2.

Table 2 Clinical features of HCC patients.

lncRNA expression

Analysis of lncRNA expression showed significantly higher levels of LINC00152, LINC00853, UCA1, and GAS5 in HCC patients compared to controls (p-values = 0.04, 0.05, 0.01, and 0.01, respectively). Similarly, serum levels of AST, ALT, total bilirubin, and AFP were significantly elevated in HCC patients (p-value < 0.01 for all) (Table 3). ROC curve analysis (Fig. 1) demonstrated moderate discriminatory power for the lncRNAs, with area under the curve (AUC) values ranging from 63 to 66%, while AFP exhibited good discrimination (AUC = 73%). Comparative analysis of AUC values revealed AST as the superior marker compared to ALT and AFP, while no significant differences were observed among the lncRNAs (Supplementary Tables 1 and 2). Although AST and ALT demonstrated strong discriminatory power, with AUC values of 98% and 90%, respectively, their lack of specificity for HCC renders them unsuitable as standalone diagnostic markers. Integrating these traditional liver function tests with lncRNA biomarkers is therefore needed to enhance diagnostic accuracy. Table 4 summarizes the optimal cut-off values, selected as the point with the highest combined sensitivity and specificity, for differentiating healthy controls from HCC patients. Sensitivity for the selected lncRNAs ranged from 60 to 83%, while specificity ranged from 53 to 67%.
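One common way to pick a cut-off that balances sensitivity and specificity is to maximize Youden's J (sensitivity + specificity − 1). Whether this exact criterion was used here is an assumption; the sketch below, with a hypothetical function name and hypothetical expression values, only illustrates the idea.

```python
def best_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its value is >= cutoff.
    """
    best = None
    for cutoff in sorted(set(cases) | set(controls)):
        sensitivity = sum(v >= cutoff for v in cases) / len(cases)
        specificity = sum(v < cutoff for v in controls) / len(controls)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical expression values, for illustration only.
cutoff, sens, spec = best_cutoff(
    cases=[2.2, 3.0, 4.0, 5.0],
    controls=[1.0, 1.5, 2.0, 3.5],
)
print(cutoff, sens, spec)  # 2.2 1.0 0.75
```

When several cut-offs share the same J, this sketch keeps the lowest one, which favors sensitivity over specificity.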

Table 3 Expression of LINC00152, LINC00853, UCA1 and GAS5 long non-coding RNAs in patients with HCC.
Fig. 1 ROC curves of LINC00152, LINC00853, UCA1 and GAS5 long non-coding RNAs and of AST, ALT and AFP. A: area under the curve; p < 0.05 considered significant.

Table 4 Diagnostic utility of LINC00152, LINC00853, UCA1 and GAS5 long non-coding RNAs in HCC.

Associations between lncRNAs and clinical features

Table 5 summarizes the associations between lncRNA expression and various liver conditions in HCC patients. Positive Hepatitis C virus (HCV) infection correlated with elevated LINC00152 expression (p = 0.001) but decreased UCA1 expression (p = 0.05). Conversely, positive Hepatitis B virus (HBV) infection was only associated with increased LINC00853 expression (p = 0.001). Interestingly, liver cirrhosis displayed a distinct lncRNA profile: UCA1 upregulation and downregulation of LINC00152 and GAS5 (p = 0.02, 0.001, and 0.001, respectively).

Table 5 Factors influencing the expression of lncRNAs.

lncRNAs and mortality prediction

Our cohort exhibited a high mortality rate, exceeding 81% within one year. Table 6 shows the potential of lncRNAs as mortality predictors after adjusting for age and sex. Notably, higher levels of LINC00152 and lower levels of GAS5 were significantly associated with an increased risk of mortality (OR = 1.01 and 0.98, with p = 0.02 and 0.001, respectively).
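Odds ratios close to 1 are reported per unit of expression, so their practical size depends on how far expression changes. As a worked illustration, the sketch below rescales the reported per-unit ORs to a larger change; the 50-unit change and the function name are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units in the predictor."""
    beta = math.log(or_per_unit)    # logistic regression coefficient
    return math.exp(beta * delta)   # equivalent to or_per_unit ** delta


# Reported per-unit ORs; the 50-unit change is a hypothetical illustration.
linc00152_or_50 = odds_ratio_for_change(1.01, 50)
gas5_or_50 = odds_ratio_for_change(0.98, 50)
print(round(linc00152_or_50, 2), round(gas5_or_50, 2))  # 1.64 0.36
```

Under this assumption, a 50-unit rise in LINC00152 would multiply the odds of death by about 1.64, while the same rise in GAS5 would reduce them to about 0.36 of their baseline value.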

Table 6 Role of lncRNAs expression in predicting mortality of HCC.

OR: odds ratio; CI: confidence interval. Goodness of fit: Hosmer-Lemeshow test, χ² = 3.4, p = 0.98. Test of significance: multiple logistic regression model with adjustment for age and sex; p ≤ 0.05 considered significant.

LncRNAs and survival probability

Following patient classification based on the lncRNA cut-off points in Table 4, Kaplan-Meier analysis with log-rank tests revealed no significant difference in overall survival probability (overall survival being defined as the interval between initial diagnosis and death from any cause) between the higher and lower expression groups, as shown in Table 7. Moreover, Cox regression analysis revealed no significant association between lncRNA expression and overall survival (Supplementary Table 3).
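For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of Python. This is a generic illustration with hypothetical follow-up times, not the software or data used in this study, and it omits the log-rank comparison between groups.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each observed death time.

    durations: follow-up time per patient; events: 1 = death, 0 = censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        at_risk = sum(d >= t for d in durations)
        deaths = sum(d == t and e == 1 for d, e in zip(durations, events))
        if deaths:
            survival *= 1 - deaths / at_risk   # product-limit update
            curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months, for illustration only.
curve = kaplan_meier(durations=[1, 2, 2, 3], events=[1, 1, 0, 1])
for time, probability in curve:
    print(time, round(probability, 3))
```

Censored patients remain in the at-risk count until their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.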

Table 7 Survival time in the higher and lower expression groups of lncRNAs.

Diagnostic performance with machine learning

The machine learning model achieved the best prediction accuracy by combining traditional laboratory data (ALT, AST, total bilirubin, albumin, and AFP) with lncRNAs. This combined approach significantly improved predictive power compared to using traditional data alone or individual lncRNAs. The Support Vector Machine (SVM) and Logistic Regression algorithms showed the strongest performance, reaching recall (sensitivity) of 100% and 97% with precision of 93% and 96.7%, respectively. Even after excluding ALT and AST and including only lncRNAs, AFP, bilirubin, and albumin, the models achieved a sensitivity of 93% and a precision of 97.5%. In contrast, removing ALT and AST from the model that used only traditional data, without lncRNAs, caused a noticeable decline in predictive power. These results indicate that lncRNAs contribute more to the model than ALT and AST, despite the high power of ALT and AST to differentiate controls from HCC patients. This advantage is likely due to the specific increase of these lncRNAs in HCC but not in other hepatic conditions, in contrast with ALT, AST, and the other traditional liver function markers. These data do not deny the role of traditional liver function tests in the model; rather, they reflect the importance of integrating lncRNA data with the other liver function data for more accurate and specific prediction of HCC, an aim that is achieved by using the machine learning model with the whole data panel. Table 8 shows the results of the machine learning models using different combinations of data sets.
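Recall and precision, the two metrics reported above, answer different questions: recall is the share of true HCC cases that the model detects, whereas precision is the share of positive predictions that are truly HCC. A minimal sketch with hypothetical labels (1 = HCC, 0 = control) follows.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the HCC class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)   # HCC correctly flagged
    fp = sum(t == 0 and p == 1 for t, p in pairs)   # control flagged as HCC
    fn = sum(t == 1 and p == 0 for t, p in pairs)   # HCC missed
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical true and predicted labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

Note that precision is not the same as specificity: precision depends on the proportion of HCC cases in the sample, while specificity is computed from the controls alone.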

Table 8 Results of machine learning models using different panels of data.

Correlation matrix analysis

Figure 2 presents the correlation matrix for all included laboratory data. The figure shows minimal to weak correlations between most studied factors, apart from ALT and AST, which exhibited a strong positive correlation as expected. These correlation matrix results support the importance of considering multiple parameters, including lncRNAs, for optimal model accuracy.
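A pairwise correlation matrix of this kind can be computed as sketched below. The correlation coefficient used for Fig. 2 is not specified, so Pearson's r is an assumption here, and the marker values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(columns):
    """Return a nested dict with the correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical marker values for four samples, for illustration only.
data = {
    "AST": [30, 45, 60, 80],
    "ALT": [25, 40, 55, 75],
    "AFP": [10, 400, 5, 120],
}
matrix = correlation_matrix(data)
print(round(matrix["AST"]["ALT"], 3), round(matrix["AST"]["AFP"], 3))
```

Weakly correlated features carry largely non-overlapping information, which is why a low off-diagonal correlation is taken as support for including each marker in a multivariable model.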

Fig. 2 Correlation matrix showing the correlation between each pair of the biomarkers used in the study: 0 = LINC00152, 1 = LINC00853, 2 = UCA1, 3 = GAS5, 4 = AST, 5 = ALT, 6 = total bilirubin, 7 = albumin and 8 = AFP.

Discussion

Hepatocellular carcinoma (HCC) poses a significant public health challenge in Egypt. Early detection is crucial for optimal patient outcomes. This study aimed to develop a machine learning model for improving HCC diagnosis by integrating long non-coding RNA (lncRNA) biomarkers with conventional liver function tests. In hepatocellular carcinoma (HCC), conventional biomarkers such as ALT and AST are commonly elevated due to liver damage, but this elevation is non-specific, occurring in various liver conditions, including hepatitis, cirrhosis, and general liver injury9. ALT and AST measure hepatocyte integrity but lack specificity to the molecular alterations unique to HCC. In contrast, lncRNAs like UCA1, GAS5, LINC00152, and LINC00853 are involved in HCC-specific oncogenic processes, including the regulation of cellular proliferation, apoptosis, and metastasis, which directly correlate with cancer pathology13. These lncRNAs provide insight into the molecular underpinnings of HCC that ALT and AST do not capture. By integrating these lncRNAs into our diagnostic model alongside ALT and AST, we significantly improved specificity and sensitivity, enhancing our ability to distinguish HCC from other liver conditions more accurately. This integration employs the unique predictive information offered by lncRNAs, which enhances the overall diagnostic power of the model and addresses the limitations of traditional liver enzymes in HCC screening.

Previous studies reported changes in the expression of lncRNAs in HCC tissues28 and their elevated levels in patients’ serum samples26. The four lncRNAs selected in this study, UCA1, GAS5, LINC00152, and LINC00853, were chosen based on their established functional relevance in HCC and their demonstrated potential as diagnostic biomarkers. UCA1 is extensively documented as an oncogenic lncRNA involved in various malignancies, including HCC. It promotes cellular proliferation, migration, and resistance to apoptosis, partially through its interaction with key pathways such as the Hippo pathway, which influences tumor growth and survival17. Elevated UCA1 expression has been associated with poor outcomes in HCC patients, suggesting its potential as an indicator of aggressive disease progression13. In contrast, GAS5 functions as a tumor suppressor, with its downregulation in HCC linked to enhanced proliferation and reduced apoptosis. Studies indicate that GAS5 plays a role in cell cycle arrest and apoptosis through mechanisms such as caspase-dependent endoplasmic reticulum stress pathways11. These contrasting roles of UCA1 and GAS5 provide complementary insights into the disease biology and justify their inclusion as diagnostic markers. LINC00152 has been shown to promote cell proliferation and migration through modulation of cyclin D1 (CCND1), with high expression linked to poor prognosis16. Furthermore, LINC00152 acts as a competing endogenous RNA (ceRNA), affecting oncogenic pathways by binding miRNAs that regulate tumor suppressor genes. LINC00853, although less extensively studied in HCC, is supported by emerging evidence for a role in cellular proliferation and invasion, making it a promising candidate for further investigation as a diagnostic biomarker for HCC23.
Although several lncRNAs were tested early in this study, our final model prioritized the combination of these four lncRNAs, which optimized predictive accuracy while minimizing complexity and cost.

Our findings show that lncRNAs alone offer moderate sensitivity and specificity for HCC diagnosis. Additionally, some of the investigated lncRNAs demonstrated a prognostic association with mortality risk. The machine learning model we implemented significantly enhanced diagnostic sensitivity and specificity, which highlights the potential of this approach for improved early screening and diagnosis of HCC.

To our knowledge, this study represents a pioneering effort in utilizing a machine learning model for HCC diagnosis by integrating lncRNAs with standard laboratory data. Using the data processing capabilities of machine learning, we achieved significant improvement in diagnostic performance, with sensitivity and specificity approaching 100%. Furthermore, the developed model was translated into a user-friendly web application, which was piloted by healthcare professionals. Their feedback indicated a straightforward user interface that delivers rapid and accurate results based on laboratory data. This cost-effective approach holds promise for large-scale screening, enabling cost-efficient testing of a vast population compared to conventional diagnostic methods. Utilizing readily available laboratory data for screening has the potential to decrease the financial burden on the healthcare system, facilitating broader and more efficient service delivery.

Previous research has evaluated the use of artificial intelligence and the accuracy of ML for the prediction and/or diagnosis of HCC, and documented variations in the accuracy of different models. Sato et al. compared different algorithms (logistic regression, SVM, gradient boosting) using clinical data and found that gradient boosting exhibited the highest accuracy29. Angelis et al., who used a publicly available HCC dataset to evaluate different techniques for feature selection and classification, also achieved the best results (84% accuracy, 93% precision) with gradient boosting30. Wong et al. reported that ridge regression and random forest models offered comparable performance to traditional scores such as CU-HCC (Chinese University-Hepatocellular Carcinoma) and GAG-HCC (Guide with Age, Gender, HBV DNA, Core promoter mutations and Cirrhosis-Hepatocellular Carcinoma) for HCC prediction in HBV/HCV patients31. In our study, the Support Vector Machine and Logistic Regression algorithms showed the strongest performance. These findings highlight the importance of algorithm selection and the potential for variation in model performance for HCC diagnosis. On the other hand, although the performance of the model developed in this study approaches 100%, the results of machine learning models are sensitive to sample size, experimental setup, and data sets. Performance might therefore differ with other data sets, which makes validation of the model on different data sets highly recommended to ensure clinical applicability.

Studies have also investigated the use of genetic data in ML models for HCC prediction. Chen et al. used a random forest model to investigate the potential of the HBV reverse transcriptase gene for HCC prediction. Their model achieved optimal performance using a combination of 10 features, demonstrating robustness across diverse HBV genotypes and sequencing depths32. Similarly, Tao et al. applied a random forest model to differentiate HCC from chronic HBV infection based on ctDNA copy number aberrations. The model achieved robust performance in the two validation cohorts they evaluated33.

Our study identified a significant association between increased mortality risk in HCC patients and both higher expression levels of LINC00152 and lower expression levels of GAS5. LINC00152 is known to be aberrantly expressed in various cancers and has been linked to cell proliferation, migration, invasion, therapeutic resistance, tumor growth, and metastasis34. Previous research established LINC00152 overexpression in HCC tissues compared to healthy controls and demonstrated its role as an independent prognostic factor associated with poorer patient survival35,36, suggesting its potential as a therapeutic target for HCC37.

In contrast to LINC00152, GAS5 demonstrated a protective effect against mortality in HCC patients, despite exhibiting higher expression levels in HCC compared to controls. Prior studies have documented the tumor-suppressive role of GAS5 in HCC, including enhancing radiosensitivity and inhibiting invasion, as well as the poor prognosis associated with its downregulation38,39,40. Collectively, our findings suggest a complex role for GAS5 in HCC, potentially playing a part in tumor initiation but also exerting a protective effect against disease progression.

Our study has some limitations. First, there are inherent limitations related to patient demographics. The study population’s median age of 63 years and predominantly male composition (81%) align with the typical HCC patient profile41,42. However, these characteristics might influence the model’s generalizability to populations with different age and gender distributions. Another limitation is the relatively small sample size. While our findings provide valuable insights, a larger cohort could strengthen the generalizability of the results. We analyzed only circulating lncRNA levels in plasma, which is suitable for screening purposes. However, integrating these data with tissue expression levels of the same lncRNAs would have offered a more comprehensive perspective. This combined approach could have provided valuable validation for our findings, offering a deeper understanding of the role of these lncRNAs in HCC. Finally, it is important to emphasize that, although the performance of the models approaches 100%, which is very promising, the model needs to be validated on different data sets before moving to clinical application, to minimize the effect of sample size and variability between data sets on the results.

Conclusions and recommendations

Our study shows that lncRNAs offer moderate diagnostic value for HCC. However, the implementation of a machine learning model that incorporates lncRNAs with standard laboratory data significantly improves their diagnostic utility. This model can be readily translated into a user-friendly interface, such as a website or mobile application, facilitating convenient use by healthcare professionals. The simplicity of the model, coupled with the relative speed and affordability of the underlying laboratory tests, positions it as a promising tool for large-scale screening.

Future research directions include evaluating the model’s robustness and prognostic prediction capabilities on a larger patient cohort. Additionally, investigation into a broader panel of lncRNAs holds promise for further refinement and optimization of the model. Moreover, investigating the model’s ability to differentiate HCC from other benign liver diseases presents a promising avenue for future research.