Main

Precision medicine for patients with advanced HCC has lagged behind other cancers. This is not because HCC has no discernible subtypes, but because targeting these has proved challenging. Tyrosine kinase inhibitors (TKIs; such as sorafenib8 and lenvatinib9) were the only first-line treatments for unresectable HCC until 2020. Thereafter, the IMbrave150 study (atezolizumab with bevacizumab)10 highlighted the potential of combination approaches with immune checkpoint inhibition (ICI) therapy, with enhanced responses for some patients and improved overall survival. Alongside advances in treatment options came an increased appreciation that heterogeneous treatment responses in patients with HCC provide a potential for patient stratification5,6. The lack of necessity for clinical biopsies in advanced HCC has resulted in a lack of tissue from late-stage disease. This hinders advances in defining clinically relevant stratification biomarkers and mechanistic understanding within subtypes for these patients. Preclinical models offer a biological platform for disease interrogation but, currently, few models faithfully recapitulate the complexity of human disease or have been validated against transcriptomic and phenotypic human HCC profiles11,12. There is therefore currently a need for human-relevant preclinical models to investigate therapy efficacies, providing guidance on subtype-specific treatments for different patient populations.

Development of a suite of HCC models

To address this need, we first set out to generate a broad range of mouse models guided by the most commonly found genetic drivers of human HCC4. Human HCC is thought to evolve from a hepatocytic clonal origin under specific conditions promoting carcinogenesis, in contrast to recently described non-malignant clonal expansion3,13,14,15,16. We reproduced this aspect of cancer biology in our models by introducing the genetic alterations into adult mouse hepatocytes using conditional recombination technology and allowing the premalignant clones to evolve to HCC over time.

We intravenously injected adult mice with a viral vector encoding Cre recombinase with a hepatocyte tropism due to its thyroxine-binding globulin (TBG) promoter, AAV8.TBG.cre. This drove recombination of endogenous floxed alleles in individual hepatocytes in an immunocompetent environment (Fig. 1a). AAV8 was titrated to a dose (6.4 × 108 genomic copies (GC) per mouse) that resulted in solitary hepatocyte targeting at low frequency (approximately 1%) and was highly hepatocyte specific (Extended Data Fig. 1a–d). Recombination occurred primarily in the first 5 days after injection, was observed across all three hepatocyte zones17, but was significantly different between male and female mice (Extended Data Fig. 1e–h). This led to a lower tumour count and consequently extended survival in female mice after induction of HCC-related oncogenes (Extended Data Fig. 1i–k). Furthermore, varying the induction dose or mutational burden affected the tumour occurrence and the speed of progression to the end point (Extended Data Fig. 1i,l).

Fig. 1: Comprehensive characterization of the genetic HCC mouse models.
Fig. 1: Comprehensive characterization of the genetic HCC mouse models.The alternative text for this image may have been generated using AI.
Full size image

a, Experimental scheme. Conditional genetically engineered mice induced with AAV.TBG.cre virus develop tumours after clonal recombination of genes classically associated with HCC in a TCGA study4. b, Specific combinations of mutations, but not numbers of mutations, drive model-specific features such as survival, tumour proliferation (Ki-67), bleeding from tumour and metastasis in mouse models of HCC. The up arrows represent gain of function (green) and the down arrows represent loss of function (red). T.a.i., time after induction (days); HET, heterozygous; HOM, homozygous. Exact values are provided in Supplementary Table 1. c, Representative images showing that variation in macroscopic and microscopic phenotype depends on combinations of mutations. Glutamine synthetase (GS) was used as an indicator of activated CTNNB1 signalling. Scale bars, 1 cm (macroscopy) and 200 µm (microscopy). Histology for the full range of HCC GEMMs is shown in Extended Data Fig. 3. d, Representative images show lung metastases resembling the primary tumour phenotype as demonstrated by haematoxylin and eosin (H&E) and GS staining. Scale bar, 100 µm. e, Mouse HCC models present common patterns and characteristics used for identification and classification of human HCC based on in-depth histopathological examination. n = 5–7 mice per cohort as indicated by bars.

Source data

We next applied this strategy to a broad range of HCC-relevant oncogene/tumour suppressor genes using a standardized dose in male mice unless otherwise stated. We particularly focused on genes identified by a TCGA study4 belonging to the WNT pathway, the cell cycle or the RTK–RAS–PI3K pathway growth. These genes were tested in multiple combinations with each other for their potency in tumour induction. (Fig. 1b). We included models with combinations that co-occur in early disease, such as CTNNB1 + MYC or PTEN + TP53. However, we also included combinations that tend towards mutually exclusive in early disease but not in late-stage disease, such as CTNNB1 + TP53 (Extended Data Fig. 1m,n). We decreased the AAV induction titre in specific instances (cohorts 14, 15, 23 and 24, 1.28 × 108 GC per mouse) to reduce the clonal burden, facilitating progression of these more aggressive models to larger individual tumours. Genotyping of end-stage tumours confirmed a high fidelity of recombination in the alleles targeted by the AAV induction (97.4–100%) (Extended Data Fig. 1o). We monitored 35 genetically distinct models, including models with a whole-body knockout of Cdkn1a or Cdkn2a, for liver nodule growth for a minimum of 230 days after induction (Extended Data Figs. 1l and 2a).

The majority of our models (83%) developed end-stage tumours within the study timeframe and most (69%) showed a tumour penetrance of higher than 50%. Notably, some combinations, such as MYC overexpression + Trp53 alteration, which induced HCC in some but not all previously published models12,18, had very low to no tumour penetrance using our clonal evolution approach and did not reach end-stage tumours within the observed period. Reflective of human disease, we observed intratumoural haemorrhaging and/or rupture (bleeding) as well as metastatic spread to the lungs, one of the main metastatic sites in human HCC, together with bone and lymph nodes2,19 (Fig. 1b–d and Supplementary Table 1). We observed a negative correlation between an increased number of driver mutations and survival, despite a reduced clonal induction with a lower AAV titre, and a positive correlation between an increased number of driver mutations and tumour proliferation, as well as between mutational burden and lung metastasis in our cohorts (Extended Data Fig. 2b). Tumour haemorrhage did not correlate significantly with mutational burden but occurred predominantly in cohorts with a mutational pattern showing activated Ctnnb1 and Pten loss without MYC overexpression (Extended Data Fig. 2c). Macroscopic and microscopic appearances were consistent with human HCC and covered a wide range of histological subtype phenotypes microscopically. This included well-differentiated HCC (for example, cohorts 5 + 19), undifferentiated HCC (such as cohorts 23 + 28), pseudoglandular HCC (for example, cohort 30) and steatotic HCC (for example, cohort 35) (Fig. 1c and Extended Data Fig. 3). Lung metastatic lesions reflected primary tumour histopathology (Fig. 1d). Histopathological assessment of morphological parameters is currently the gold standard for differential diagnosis of liver cancer in patients20. They showed strong similarities to human HCC histopathology, including typically observed architectural patterns (trabecular, glandular, solid and nested) and cytological atypia. Different combinations of genetic alterations resulted in distinct morphologies (Fig. 1e).

In summary, we used combinatorial genetic alterations, relevant to human HCC, to drive the development of autochthonous tumours in 27 immunocompetent mouse models. Tumour growth happened progressively over several months with individual hepatocytes as the cell of origin. These models recreate key features characteristic of human HCC biology, including histopathological phenotypes and metastatic spread.

Cross-species alignment and validation

To determine how well our models further represent human HCC, we performed unbiased transcriptional analysis. We included a range of well-established carcinogen-induced (TOX) and orthotopic transplant (OT) HCC mouse models with our genetically engineered mouse models (GEMMs) to make this comparison more comprehensive (Fig. 2a).

Fig. 2: Transcriptional alignment classifies four common human/mouse (HuMo) clusters.
Fig. 2: Transcriptional alignment classifies four common human/mouse (HuMo) clusters.The alternative text for this image may have been generated using AI.
Full size image

a, Summary overview of mouse models used for transcriptional analysis. In addition to the GEMMs, described in Fig. 1, TOX and OT models were included. These include mice that were treated with diethylnitrosamine (DEN), carbon tetrachloride (CCl4) and streptozotocin (STZ), as well as multiple diets: modified western diet (MWD), American-lifestyle-induced obesity syndrome (ALIOS), high-fat diet (HFD) or normal chow (NC). b, The UMAP visualization demonstrates overlap of mouse (GEMM, TOX and OT) and human (TCGA) HCC transcriptional datasets4. c, Unbiased clustering using a Louvain community detection algorithm identifies four groups within human and mouse (GEMM, TOX and OT) HCC data. d, The distribution of the subgroups identified in c with UMAP highlights shared HuMo clusters. e, All HuMo clusters are represented in the analysed GEMMs with varying heterogeneity within the individual cohorts.

Source data

Using nonlinear dimensionality reduction (uniform manifold approximation and projection, UMAP21) we mapped mouse end-stage HCC data onto the human HCC data4 (Fig. 2b). Individual models, both genetically modified and non-genetically modified, clustered within different regions in the UMAP plot (Extended Data Fig. 4a). However, mutational status is not always indicative of signalling status22, and genomic profiling of human HCC previously showed that mutations are not exclusively prognostic of association with specific subtypes4. This is especially relevant for advanced disease stages with a relatively high mutational burden23, where different genetic alterations can influence each other. We show that, for example, mutations in CTNNB1/Ctnnb1 (human/mouse gene) do not always lead to upregulation of expression of downstream pathway targets (GLUL/Glul, LGR5/Lgr5, LECT2/Lect2 or NOTUM/Notum) in human or mouse HCC (Extended Data Fig. 4b–f). Our mouse data also support the observation that mutational status by itself is not always predictive of the resemblance between cohorts (Extended Data Fig. 4a).

We therefore went on to compare the human and mouse transcriptome data based on functionally and mechanistically relevant pathway enrichment. We used the Louvain method for community detection24 to identify groups in our human/mouse HCC dataset (Fig. 2c). We detected four major human/mouse (HuMo) clusters (Fig. 2d). Genetic mouse models are represented in all four clusters with varying heterogeneity within cohorts, whereas the purely carcinogen-induced models are representative of only HuMo cluster 2 (Fig. 2e). Pathway enrichment analysis could establish cluster-specific characteristics. HuMo cluster 1 was enriched for pathways linked to metabolism and differentiation, but had negative enrichment for proliferation and inflammatory pathways. HuMo cluster 2 was related to cluster 1 but was distinct particularly through a higher enrichment in pro-inflammatory pathways. HuMo clusters 3 and 4 were both poorly differentiated and highly proliferative, with cluster 4 showing enrichment in epithelial-to-mesenchymal transition (Fig. 3a).

Fig. 3: Individual HuMo clusters have distinct transcriptional and histological features.
Fig. 3: Individual HuMo clusters have distinct transcriptional and histological features.The alternative text for this image may have been generated using AI.
Full size image

a, Pathway enrichment analysis across the GEMMs, non-GEMMs and human TCGA-HCC data4, indicating distinct identifying characteristics, including metabolic activity/differentiation, MYC/Myc pathway activation, proliferation propensity or immune status for the four HuMo clusters. n = 371 (human) and 187 (mouse). DN, downregulated; UP, upregulated; mut, mutated; amp, amplified; fl, floxed. bd, Transcriptional alignment correlates with histopathological similarities (inflammation (b), steatosis (c) and extracellular matrix (ECM) (d)) between human (n = 334) and mouse (n = 147) liver samples from the same HuMo clusters. Data are the log odds ratio (dots) 95% confidence intervals (bars). Statistical analysis was performed using Fisher tests; P > 0.05 (open circles), P ≤ 0.05 (closed circles). e, The distribution of HuMo clusters 1 to 4 and their alignment to previously reported molecular and immune HCC classifications and signatures26,27,28,29 in a validation cohort of human HCC25. The full pathway heat map is shown in Extended Data Fig. 6 and associated statistical analysis is shown in Supplementary Table 2. n = 171 (human HCC).

Source data

To assess whether the transcriptional clustering corresponded to similar histopathological features in mice and human HCC within the same cluster, we compared our mouse tumours to TCGA tissue4. We observed that mouse and human tissue belonging to the same HuMo cluster did indeed have analogous morphological characteristics (Extended Data Fig. 5a). Tissue from HuMo cluster 1 showed well-differentiated HCC (Extended Data Fig. 5b,c). HuMo cluster 2 tissue presented with inflammation, steatosis and steatohepatitis (Fig. 3b,c and Extended Data Fig. 5d). HuMo cluster 3 and 4 tissue displayed deposition of extracellular matrix and moderately (cluster 3) to poorly (cluster 4) differentiated HCC (Fig. 3d and Extended Data Fig. 5b,c). We next validated our classification in a previously published, independent dataset of human HCC25. The patients could all be assigned a HuMo cluster with similar distribution dynamics to the TCGA dataset. Again, HuMo cluster 1 was enriched for immune-evasive signatures, including the immune-excluded subclass25, and was de-enriched for ICI response signatures, including the IFNAP signature26. Conversely, HuMo cluster 2 had higher inflammatory signalling signatures and was enriched for immune-active tumours25, but without WNT–β-catenin activation. HuMo clusters 3 and 4 featured a strong progenitor signature (CK19 mutation signature) consistent with the previously observed histological phenotype of these clusters. Only HuMo cluster 4 was significantly enriched for the inflamed HCC class with an immune-exhaustion signature and characterized by TGFβ and EMT signatures (Fig. 3e and Extended Data Fig. 6a,b).

When comparing survival across the species, there was general correlation between patients and the respective GEMMs across a range of molecular subtype classifications, including HuMo, Hoshida27 and Chiang28. Importantly, HuMo offers a distinct patient classification. This clustering approach distinguished two patient populations within the Hoshida S3 molecular subclass, namely HuMo clusters 1 and 2. Hoshida et al. implied that S3 might consist of two subpopulations with CTNNB1 as a dividing factor, but did not use this as a factor in their classification27. This distinction in our analysis resulted in differences in patient survival that were unappreciated when using the Hoshida classification; patients associated with HuMo cluster 2 had an improved survival probability relative to patients associated with the other HuMo clusters. Furthermore, this distinction separates the immune-excluded (HuMo cluster 1) from the immune-active (HuMo cluster 2) subclasses. It also surpasses previous attempts of comparing mouse and human HCC data in scale and detail11,12 (Extended Data Figs. 6a and 7 and Supplementary Table 2).

In brief, we identified four distinct clusters, common across human and mouse models, by integrating our mouse transcriptional data with human HCC transcriptional data. Our models recapitulate transcriptionally the full range of human HCC, including within individual clusters. This aligned with similar histopathological features and relative survival within clusters, with specific GEMMs representative of individual subtypes of human HCC. Moreover, our HuMo classification is able to discriminate between HCC with WNT–β-catenin activation (HuMo1) and those without WNT–β-catenin activation (Humo2) within non-proliferative tumours (Hoshida S3).

Distinct responses to therapy by subtype

To examine the translational potential of our models, we investigated response to standard-of-care treatments. We focused on one model in a proof-of-principle set of experiments. Approximately 30% of patients with HCC have mutations leading to activation of the β-catenin signalling pathway4. HCC with activated β-catenin signalling has a low enrichment score for immune signatures and has been, in most cases, associated with immune exclusion25,29. Furthermore, active β-catenin pathway signalling has been linked to ICI resistance in a prospective HCC cohort study5, suggesting a need for alternative treatment options for this patient subgroup. In the TCGA dataset, 65% (57 out of 88) of patients with mutations in CTNNB1 were associated with HuMo 1 and made up 47% (57 out of 118) of patients in that cluster (Extended Data Fig. 8a,b). Moreover, humans and mice associated with HuMo cluster 1 had immune-cell paucity and a low immune score (Fig. 3a,b and Extended Data Fig. 8c–e). We therefore identified HuMo cluster 1 as the one most likely to correspond to the group of patients with activated β-catenin pathway signalling that would benefit from alternative treatment options. Cohort 5 mice (Ctnnb1ex3/WTR26LSL-MYC/LSL-MYC, hereafter BM mice) were used as a representative model and showed phenotypic resemblance to human CTNNB1-mutated HCC.

We aimed to mimic the treatment of established tumour lesions. We therefore first performed a time-course analysis for tumour onset in the BM mouse model (cohort 5) to determine an appropriate timepoint for the start of treatment. We observed clonal induction of hepatocytes, which evolved over time into microscopic lesions and then macroscopic tumour nodules, with glutamine synthetase (GS) as a marker of β-catenin driven tumour induction (Fig. 4a–c). Tumour evolution from single clones led to moderate intertumoural and intermurine transcriptional heterogeneity in end-stage tumours, including activation of pro-tumorigenic pathways such as proliferation or angiogenesis. However, while gene expression in tumours was markedly different to non-tumour tissue, it was also consistently different compared with livers with a global hepatocytic short-term expression of the same oncogenes (Extended Data Fig. 8f–i). This implied a consistent trajectory of clonal evolution occurring during tumour progression3,13. Relevant long-term models in which this evolution can take place are essential for studying HCC in preclinical models.

Fig. 4: Testing standard-of-care therapies and new therapeutic class identification in a representative mouse cohort of HuMo cluster 1.
Fig. 4: Testing standard-of-care therapies and new therapeutic class identification in a representative mouse cohort of HuMo cluster 1.The alternative text for this image may have been generated using AI.
Full size image

a, The cohorts used in bk. b, Temporal tracking of tumour development from a single clone to established HCC in male BM (cohort 5) mice using microscopic nodule detection through GS and macroscopic whole-liver assessment. The black arrows indicate macroscopic lesions at day 90. Scale bars, 200 µm (microscopic) and 1 cm (macroscopic). c, Quantification of microscopic and macroscopic nodules and macroscopic nodule count over time in male BM mice. n = 5, 6 and 9 mice for days 15, 30/60/90 and 125 respectively. Data are mean ± s.e.m. d, The treatment scheme for ej. e,f, Treatment with the TKIs sorafenib (45 mg per kg, oral) (e) or lenvatinib (10 mg per kg, oral) (f) in male BM mice. The dotted vertical line indicates the treatment start. n = 17, 15, 5 and 6 mice for sorafenib vehicle, sorafenib, lenvatinib vehicle and lenvatinib, respectively. Statistical analysis was performed using log-rank tests. g, Combination treatment with VEGFRi (3 mg per kg, oral) and the immune-checkpoint inhibitor anti-PD1 (200 µg per mouse, intraperitoneal) in male BM mice. The dotted vertical line indicates the treatment start. n = 9 (vehicle + IgG isotype) and 8 (VEGFi + anti-PD1). Statistical analysis was performed using log-rank tests. hj, Timepoint analysis (h) and quantification of GS (i) and Ki-67 and cleaved caspase 3 (CC3) (j) immunohistochemistry at day 15 and 30 after lenvatinib treatment in male BM mice. The dotted lines indicate the tumour borders. Scale bars, 1 mm (GS) and 200 µm (Ki-67). n = 5 and 6 mice at days 15 and 30, respectively (for vehicle and lenvatinib). Data are mean ± s.e.m. Statistical analysis was performed using a two-tailed unpaired t-test (day 15) and a Mann–Whitney U-test (day 30). k, High-throughput screening of 147 FDA-approved anti-cancer drugs plus internal controls, highlights antimetabolites (red) having an effect on growth of the HCCO tumouroids from BM mice; with cladribine (cl) having the greatest effect, while lenvatinib (le)/sorafenib (so) have only modest effect on the tumour cells. The full ranking is shown in Supplementary Table 3. l,m, In vitro validation of cladribine efficacy in mouse (l) and human (m) HCCOs. n = 3 different passages from 1–2 HCCO lines per mouse cohort, technical duplicates; 3 different passages from one to five human HCCO lines per driver combination, technical duplicates. Data are mean ± s.e.m.

Source data

We started drug treatment at day 90, based on 100% of cohort 5 (BM) mice having macroscopic tumours and 96% of cohort 5 (BM) mice surviving past this timepoint (Fig. 4a–d and Extended Data Fig. 2a). Cohort 5 mice showed a significant increase in survival after treatment with the TKIs sorafenib and lenvatinib (Fig. 4e,f). However, treatment with the ICI agent anti-PD1 or treatment with ICI + VEGFRi (modelling atezolizumab + bevacizumab as first-line HCC systemic therapy10) did not impact the overall survival in this cohort (Fig. 4g and Extended Data Fig. 9a,b). These results are similar to the reported drug responses to TKIs and ICI in human patients with activated β-catenin signalling5. In mice from the immune-active HuMo cluster 2, ICI + VEGFRi resulted in an improved survival (Extended Data Fig. 9a,c). This is also consistent with the transcriptomic signatures predicting ICI response (Extended Data Fig. 6a).

Investigating disease progression after initial response to therapy, we observed changes in macroscopic and microscopic appearances in end-stage tumours of cohort 5 (BM) mice treated with lenvatinib. Tumours were different in colour and stiffer. Microscopic HCC patterns shifted from mostly well-differentiated to a poorer differentiated phenotype with a greater stromal presence (Extended Data Fig. 9d,e). Furthermore, more mice in this treatment arm presented with lung metastases compared with vehicle treatment or other treatments (Extended Data Fig. 9f). Monitoring of tumour growth using magnetic resonance imaging suggested a delayed and decreased tumour growth initially after lenvatinib treatment (Extended Data Fig. 9g). We also observed a higher metastatic burden in a second model (cohort 23, Ctnnb1ex3/WTR26LSL-MYC/LSL-MYCPtenfl/flTrp53R172H/WTCdkn2aKO/KO) with increased survival after lenvatinib treatment (Extended Data Fig. 9h–j). We hypothesized that the increased aggressiveness, manifested by morphological changes and greater metastatic burden, resulted from the extended survival coupled with an altered phenotype associated with acquired resistance to lenvatinib therapy. We therefore investigated livers of cohort 5 (BM) mice after 15 days and 30 days of lenvatinib treatment from day 90 after induction (Fig. 4d). We observed no differences in tumour morphology, but there was a decreased tumour burden through less proliferation, without increased cell death, at both the 15 and 30 day timepoints in lenvatinib-treated mice (Fig. 4h–j). There were no detectable metastases at either timepoint, supporting our hypothesis that the heightened aggressiveness in this model is a late-stage on-treatment event.

Overall, treatment responses in this specific GEMM were reminiscent of a distinct, common and difficult-to-treat subtype of HCC, characterized by a transient survival benefit observed in human phase 3 clinical studies8,9.

Screening-based therapy identification

After establishing the response to current standard-of-care treatments of mice representative of HuMo cluster 1 (cohort 5, BM), we concentrated on identifying therapeutic options for this difficult-to-treat subgroup. We performed an in vitro high-throughput screen based on GEMM-derived HCC organoids (HCCOs)30, with subsequent in vivo validation in the respective GEMM (Extended Data Fig. 10a).

HCCOs recapitulate the transcriptomic profile, histological organization and tumorigenic potential of the primary tumour31,32 and are therefore suited to investigate drug effects on tumour cells. They allow for rapid testing of a large range of drugs and for a side-by-side comparison between mouse-derived and human-derived tumour cells.

HCCOs derived from end-stage tumours of cohort 5 (BM) mice expressed β-catenin, its downstream target GS and retained MYC overexpression as well as markers of proliferation (Ki-67) and differentiation (HNF4a), features that are shared with the corresponding primary tumour (Extended Data Fig. 10b). Despite these similarities, the transcriptional phenotype of HCCOs differed from the original tumours. We propose that this is due to the simplified nature of HCCOs as an epithelial-cell-only model as well as adaptive response to the culture conditions. Overall, there was a convergence of HCCO phenotype arising from diverse GEMMs (Extended Data Fig. 10c). We tested a comprehensive drug library consisting of the 147 FDA-approved anti-cancer drugs available at the time (June 2019) plus internal controls and analysed their effect on HCCO growth (Fig. 4k and Supplementary Table 3). The most efficacious drugs were a group of antimetabolites—nucleobase analogues that interfere with DNA synthesis (Fig. 4k and Extended Data Fig. 11a). We validated the dose-dependent effect of cladribine, the most effective antimetabolite, in several distinct mouse and human HCCOs. This confirmed the results of the screen and demonstrated the nanomolar potency of cladribine (Fig. 4l,m). To establish whether this is a compound-specific effect, we tested a wide variety of antimetabolites and validated the high-throughput screen results. We demonstrated similar efficacy of clofarabine (a second-generation version of cladribine) and cladribine itself, suggesting a drug-specific on-target effect within this subclass of antimetabolites (Extended Data Fig. 11b,c). We also tested selected other drugs from our screen. Lenvatinib and sorafenib showed little tumour-epithelial efficacy in both the screen and separate validation, including in combination with cladribine (Extended Data Fig. 11b–f). Next, we treated cohort 5 (BM) mice, representing HuMo cluster 1, with either cladribine monotherapy or combination therapy of cladribine and lenvatinib, as a standard-of-care TKI (Fig. 5a,b). Cladribine monotherapy led to increased survival, but combination therapy extended survival further (Fig. 5c). Cladribine monotherapy reduced the number of tumours, but the remaining tumours still progressed to end-stage HCC. Combination therapy with lenvatinib showed a synergistic effect, almost completely eradicating all tumours (Fig. 5d,e).

Fig. 5: HuMo-cluster-specific treatment response to cladribine.
Fig. 5: HuMo-cluster-specific treatment response to cladribine.The alternative text for this image may have been generated using AI.
Full size image

a,b, Summary of the cohorts (a) and treatments (b) in cm. c, Survival after cladribine treatment alone/or in combination with lenvatinib in male BM mice (cohort 5, HuMo 1). The dotted vertical line indicates the treatment start. n = 11 (vehicle), 13 (cladribine) and 13 (cladribine + lenvatinib). Statistical analysis was performed using log-rank tests. d, The liver-weight/body-weight ratio and tumour count over time. Data are shown as bars (liver weight/body weight) and symbols (tumour count) for individual mice. n = 10 and 9 (vehicle), 11 and 11 (cladribine), and 13 and 12 (cladribine + lenvatinib). e, Representative macroscopy images of male BM mice treated with vehicle, cladribine or cladribine + lenvatinib at the indicated days. Scale bar, 1 cm. fh, BM mice treated with cladribine + lenvatinib were assessed for tumour proliferating cells (Ki-67) and CD3+ T cells. Representative images (f) and quantification of tumour area and Ki-67+ cells (g) and CD3+ cells (h). Non-tumour tissue (NT) and tumour tissue (T) are matched. Scale bars, 200 μm (H&E, GS and Ki-67) and 100 µm (CD3). n = 7 throughout. Data are mean ± s.e.m. Statistical analysis was performed using two-tailed Kruskal–Wallis tests with Dunn’s correction (tumour area) and one-way analysis of variance (ANOVA) with Tukey’s correction (Ki-67 and CD3 quantification). ik, Priming of BM mice with cladribine + lenvatinib before treatment with anti-PD-1. i, Tumour counts at the day 7 and 30 timepoints. n = 7 (cladribine + lenvatinib, day 7; cladribine vehicle + lenvatinib vehicle + anti-PD-1 isotype, day 30; and cladribine + lenvatinib + anti-PD-1 isotype, day 30), n = 6 (cladribine vehicle + lenvatinib vehicle, day 7; and cladribine vehicle + lenvatinib vehicle + anti-PD-1, day 30), n = 8 (cladribine + lenvatinib + anti-PD-1, day 30). Data are mean ± s.e.m. Statistical analysis was performed using two-tailed Mann–Whitney U-tests (day 7) and Kruskal–Wallis tests with Dunn’s correction (day 30). j, The density of cytotoxic T cells in the tumour and stroma at day 30 after priming + anti-PD-1. n = 4 (cladribine vehicle + lenvatinib vehicle + anti-PD-1 isotype), n = 6 (cladribine vehicle + lenvatinib vehicle + anti-PD-1; and cladribine + lenvatinib + anti-PD-1 (note that four mice had no tumours to evaluate)) and n = 7 (cladribine + lenvatinib + anti-PD-1 isotype, day 30). Data are mean ± s.e.m. k, Representative images. Scale bars, 100 µm. l, Treatment with cladribine with or without lenvatinib in male cohort 23 mice (Ctnnb1ex3/WTR26LSL-MYC/LSL-MYCPtenfl/flTrp53R172H/WTCdkn2aKO/KO, HuMo 4). The dotted vertical line indicates the treatment start. n = 9 (cladribine vehicle + lenvatinib vehicle), 9 (cladribine + lenvatinib vehicle) and 10 (cladribine + lenvatinib). Statistical analysis was performed using log-rank tests. IC50, half-maximal inhibitory concentration. m, Treatment with cladribine with or without lenvatinib in male cohort 45 (R26LSL-MYC/LSL-MYCKrasG12D/WT, HuMo 2). The dotted vertical line indicates the treatment start. n = 8 (cladribine vehicle + lenvatinib vehicle), 8 (cladribine + lenvatinib vehicle), 8 (cladribine vehicle + lenvatinib) and 9 (cladribine + lenvatinib). Statistical analysis was performed using log-rank tests.

Source data

Study progression to either clinical tumour end point or study end point (day 270 after induction) was limited in some animals (31% cladribine, 62% cladribine + lenvatinib) due to clinically substantial weight loss (<80%).

Treatment with either monotherapy or combination therapy showed a decrease in proliferation in end-stage tumours, but no alteration in apoptotic cell death. Notably, we observed an increase in CD3+ T cell infiltration into the tumour after combination therapy compared with vehicle treatment (Extended Data Fig. 12a–d). As the time of end point varied greatly between the different treatments, we analysed tumours at a defined timepoint of 30 days after the treatment start. Mice on monotherapy or combination therapy showed decreased tumour size and number, with a significant decrease in proliferation (Fig. 5f,g and Extended Data Fig. 12b,e). Both healthy and tumour tissue exhibited a greater extent of DNA damage (pH2AX), as expected after treatment with a nucleobase analogue, but this did not alter upregulation of another senescence marker (p53) nor apoptosis (Extended Data Fig. 12f–h). Again, we observed increased infiltration of CD3+ T cells into the tumour of mice that were treated with combination therapy (Fig. 5h). Given the lymphocyte infiltration observed after combination therapy, we tested a ‘priming’ approach with 1 week of combination therapy before ICI therapy (Extended Data Fig. 12i). This resulted in anti-tumour efficacy and immune infiltration and cytotoxicity (Fig. 5i–k and Extended Data Fig. 12j–l).

Finally, we tested whether cladribine, either as monotherapy or in combination with lenvatinib, is equally effective in mouse models representing other HuMo clusters. We treated cohort 23 (Ctnnb1ex3/WTR26LSL-MYC/LSL-MYCPtenfl/flTrp53R172H/WTCdkn2aKO/KO) mice, representing HuMo cluster 4, and cohort 45 (R26LSL-MYC/LSL-MYCKrasG12D/WT) mice (induced with a higher titre of AAV.TBG.cre than cohort 32 to increase tumour burden to make survival time comparable to cohort 5), representing HuMo cluster 2 (Fig. 5b). Both monotherapy and combination therapy were effective in prolonging the survival of cohort 23 mice (Fig. 5l). However, cladribine did not extend the survival in cohort 45 mice (a wild-type (WT) Ctnnb1 model with mutated Kras), either as a monotherapy or as a combination therapy with lenvatinib (Fig. 5m).

In this proof of concept, we demonstrated the potential of our GEMM platform to identify epithelial-targeting therapies that synergized effectively with standard-of-care treatments, the latter of which mainly targeting the tumour microenvironment. This combination of TKI and a repurposed FDA-approved anti-cancer compound led to highly effective subtype-specific treatment responses and a switch to a targetable immune phenotype.

Discussion

Using a range of genetic alterations that are frequently associated with human HCC4, we developed a suite of immunocompetent mouse models that closely resembles the development and progression of human HCC with hepatocytes as the cell of origin. Our models successfully recreate key molecular and pathophysiological events typical of human HCC, including tumour haemorrhaging and metastasis to the lungs2,19. They mimic various tumour microenvironments, such as immune-active and immune-desert or high/low stroma tumours. We demonstrated the clinical relevance of our models by integrating mouse data with publicly available human HCC datasets, defining shared subtypes and proving response to standard-of-care treatment. Furthermore, we showed that these models can be used as a preclinical platform, together with HCCOs, for investigating rapid drug repurposing, in addition to studying tumour evolution and mechanisms of drug resistance.

We appreciate that not all genetic alterations associated with HCC have been tested in this study. TERT promoter modifications, despite being frequent in human HCC (up to 60% of patients)23, are difficult to model appropriately in mice due to biological differences between species. Mice have long telomeres and it would take several generations of crossing mice with Tert deletions before detecting a noticeable effect of reactivating Tert33. This is an obstacle that will be difficult to overcome in mouse models of HCC and other means are needed to study TERT promoter mutations and their therapeutic targetability. However, as TERT promoter mutations are so omnipresent in HCC, they might be less relevant for subtyping and we did not identify a specific human TERT group that was separate from our GEMMs.

Furthermore, some combinations of genetic alterations showed low/no tumour penetrance in our GEMMs, for example, Trp53 modifications in combination with MYC overexpression, while these showed high penetrance in HCC in previous models using hydrodynamic tail vein injections12. Administration of hydrodynamic tail vein injection has been shown to cause apoptosis in the liver34, leading to higher inflammation and favourable conditions for tumour development. Moreover, levels of MYC might be a determining factor in a clone progressing to a tumour35.

Although the majority of our studies were performed in male mice, we found no indication that the results are sex specific. Indeed, when we used the same genetic alterations in female mice, we observed a similar phenotype and cluster association (cohort 5/6, BM, male/female, respectively). However, AAV.TBG.cre induction seems to be less potent in female mice, which is particularly impactful in models with a lower mutational burden. Future experiments are needed to explore further genetic alterations or risk factors predominantly associated with female HCC in patient stratification, such as Bap1 mutations or malignant transformation of hepatocellular adenomas4,36.

In contrast to the GEMMs, human patients usually present with cirrhosis, which probably influences the course of disease establishment and progression3 and impacts treatment options2,20. Future research incorporating multifaceted environmental factors in preclinical models, including advanced fibrosis, is needed to better understand HCC biology and potential differences between species. Our models can also be easily combined with environmental liver disease models, such as high-fat diets. Preliminary data from our transcriptomic analyses indicated that genetics dominate cluster association, with the addition of background fibrotic disease having little transcriptomic influence in mice (cohort 5 versus 37).

Our models strike a balance between allowing time for tumour evolution while still being time efficient. This enables future detailed investigation of tumour evolution and factors contributing to malignant transformation, especially as not all of the recombined clones expand into tumours.

Somatic mutations are poorly clinically actionable in HCC and remain difficult to target therapeutically. In the case of multiple genetic alterations, each individual contribution to tumorigenesis might be difficult to determine37,38,39. Our models with their increased complexity of multiple genetic alterations, similar to the mutational burden of late-stage HCC23, enable the exploration of alternative targets and might contribute to understanding mutational dominance in different contexts. Moreover, by mimicking clonal evolution, they might help to identify the stage in tumour development—initiation, early nodule growth, malignant transformation—when a drug has an optimal effect.

We show that HCCOs are a tractable and rapid platform to identify treatments in combination with efficacy testing in vivo, and promote the principles of the 3Rs (replacement, reduction and refinement) for humane animal research. However, current cell culture conditions limit the translatability of HCCO-based drug response predictions and, therefore, validation in animal models remains essential. Future research in HCCOs needs to overcome the reduced complexity in cell culture, a general issue in organoid culture40, and address options for co-culture with cells shaping the tumour microenvironment41. Modifying HCCOs with CRISPR technology may also provide useful insights to explore tumour biology and the mechanisms beyond drug vulnerabilities42.

Importantly, we show that our GEMMs map transcriptionally and histologically to human HCC. Using a computational biology approach has enabled us not only to position our GEMMs, and select carcinogen-induced models, against human HCC, but also to identify four shared subclasses with defining characteristics. Notably, some of our models show a degree of heterogeneity often observed in human HCC43, with tumours associated to several HuMo clusters. Our newly developed GEMMs represent all identified subtypes, whereas chemical-carcinogen-induced models included in this study only mapped to one HuMo cluster (cluster 2).

Our preclinical platform and classification system can be used as a resource for the HCC research community to streamline preclinical research and increase comparability of different mouse models. Furthermore, linking preclinical models with patient data can aid in stratifying patients to treatment, identifying new therapies and improving the likelihood of translational success. The HCCO screen enabled us to rapidly identify and test an FDA-approved anti-cancer drug, cladribine—not previously linked to HCC—in a clinically relevant model. We could show efficacy and improved survival in vivo together with standard-of-care treatment, which will allow for a swift translation into the clinic.

We believe that our approach of linking preclinical models to human data in a subtype-specific manner will also be applicable, cross-referable and advantageous in translational research of other solid cancers.

Methods

Mice, diets and treatments

All animal experiments were performed in accordance with UK Home Office licences (70/8891, PP0604995, 70/8646, 70/8468 and PP8854860) and in accordance with the UK Animal (Scientific Procedures) Act 1986 and EU direction 2010. They were subject to review by the animal welfare and ethical review board of the University of Glasgow and the University of Newcastle upon Tyne. To minimize pain, suffering and distress to the animals, we used single-use needles and non-adverse handling techniques. Mice were housed under controlled conditions (specific-pathogen free, 12 h–12 h light–dark cycle, 19–22 °C, 45–65% humidity) with access to food and water ad libitum. We added environmental enrichments, in the form of gnawing sticks, plastic tunnels and nesting material to all of the cages. Welfare of animals was defined by clinical symptoms, including visible masses, any degree of reduced mobility/distress, weight loss or evidence of haemorrhage; however no maximal tumour volume end points for intrahepatic tumours were mandated. No mouse exceeded the humane end points stipulated in the Home Office Licenses.

Unless otherwise specified, male mice on a mixed background were used. The following transgenic mice strains were used: Gt(Rosa)26Sortm14(CAG-tdTomato)Hze (R26LSL-Tom)44, Ctnnb1tm1Mmt (Ctnnb1ex3)45, Gt(Rosa)26Sortm1(MYC)Djmy (R26LSL-MYC)46, Trp53tm1Brn (Trp53fl)47, Trp53tm2Tyj (Trp53R172H)48, Cdkn2atm1.1Brn (Cdkn2aKO)49, Ptentm2Mak (Ptenfl)50, Gt(Rosa)26Sortm1(Notch1)Dam (R26LSL-NICD)51, Krastm4Tyj (KrasG12D)52, Cdkn1atm1Led (Cdkn1aKO)53, Axin1fl (ref. 54), Bap1tm2c(EUCOMM)Hmgu (ref. 55). Genotyping was performed by Transnetyx using ear notches taken for identification purposes at weaning (3 weeks of age). Mice were induced between 8 and 12 weeks of age, unless otherwise indicated, with AAV8.TBG.PI.eGFP.WPRE.bGH (AAV8-TBG-GFP) (Addgene, 105535-AAV8), AAV8.TBG.PI.cre.rBG (AAV8-TBG-cre) (Addgene, 107787-AAV8) or AAV8.TBG.PI.Null.bGH (AAV8-TBG-Null) (Addgene, 105536-AAV8). Virus was diluted in 100 µl PBS to the desired concentration and injected through the tail vein. Unless otherwise specified, mice received a dose of 6.4 × 108 GC per mouse.

For the GEMM + MWD model, 6-week-old mice were kept on a modified western diet (Envigo, TD.120528) plus sugar water (23.1 g l−1 fructose and 18.9 g l−1 glucose) in combination with repeated CCl4 injections (intraperitoneal (i.p.), 0.2 µl g−1 of body weight; vehicle, Cornoil) as previously described56 and were induced with AAV.TBG.cre at 10 weeks of age.

For the DEN/ALIOS model, C57BL/6 WT mice, were injected with a single dose of DEN (80 mg per kg by i.p. injection) at 14 days of age. Mice were fed ALIOS diet (Envigo, TD.110201) and sugar water (23.1 g l−1 fructose and 18.9 g l−1 glucose) from 60 days of age. Mice were collected at day 284.

For the MWD + CCl4 model, the mice were kept on a modified western diet (Envigo, TD.120528) plus sugar water (23.1 g l−1 fructose and 18.9 g l−1 glucose) in combination with repeated CCl4 injections (i.p., 0.2 µl g−1 body weight; vehicle, Cornoil) as previously described56.

For the streptozotocin (STZ) model, male and female C57BL/6J WT mice were injected with a single dose of STZ (200 µg in 0.1 M citrate buffer, pH 4.0) subcutaneously at 2 days of age. Mice were fed a high-fat diet (TestDiet 58R3, 1810835) from 30 days of age. All STZ–HFD-treated livers showed pale yellow colour at 6 weeks, mild swelling at 8 weeks, granular surface at 12 weeks and tumour protrusion at 20 weeks of age57. Mice were collected between 17 and 35 weeks of age.

For the orthotopic model, Hep-53.4 cells (female C57BL/6J hepatoma cell line; Cytion, LOT-L230232R) were injected intrahepatically into the left lobe of male C57BL/6J mice. The procedure was performed under isoflurane general anaesthesia. Analgesia was given to the mice for pain management. Mice were collected at 28 days after implantation or left to reach an approved humane end point.

For therapeutic intervention in the BM model, drugs were given at 90 days after induction or in other models determined by mean cohort survival relative to the BM model survival as indicated in the figures. The following drugs were used: sorafenib (LC Laboratories S8502, daily, oral gavage, 45 mg per kg; vehicle, 50% chremophor/50% ethanol, then, before dosing, 3 parts H2O added), lenvatinib (SelleckChem, S1164 (end-point studies); or Eisai (monotherapy timepoint studies); daily; oral gavage, 10 mg per kg, vehicle, 3 mM HCl), anti-PD1 (BioLegend, RMP1-14; twice per week; i.p., 200 µg; vehicle, PBS; control, IgG, BioLegend, RTK2758), cladribine (SelleckChem, S1199; daily; i.p., 20 mg per kg; vehicle, PBS), VEGFRi (AstraZeneca, AZD2171; daily; oral gavage, 3 mg per kg; vehicle, 0.5% (w/v) HPMC, 0.1% Tween-80, in H2O). To help with drug-induced weight loss, mice on cladribine treatment received irradiated peanuts and sunflower seeds as diet supplements. If mice reached 83% of their weight at treatment start, cladribine treatment was withheld until they gained weight to at least 90% of weight at treatment start. Mice who dropped below 80% of weight at treatment start were sampled according to licence limitations. Confounding factors (for example, litter mates, induction date) were taken into consideration when allocating mice into groups but mice were not randomized using a specific method. Mice who presented with a visible tumour before treatment start were excluded from the experiments according to a priori established criteria. Animal technicians dosing the mice were blinded to the genotype of the mice. The number of biological replicates was ≥3 mice per cohort for all experiments. Further details are provided in the figure legends and Supplementary Table 1.

Animal tissue collection

GEMMs were sampled at specific timepoints or at the end point. The end point was defined as the mouse having reached a liver weight/body weight ratio of >20% or having adverse side effects from the tumour, such as tumour haemorrhaging. Mice who died of tumour haemorrhaging were included in the survival analysis but not in any downstream analysis. Tumours were measured macroscopically using digital callipers, and visible tumours were counted. Images of whole livers were taken using a Canon PowerShot G9X camera with a ruler present in each picture. Tissue was either sampled in neutral buffered saline containing 10% formaldehyde or snap-frozen on dry ice.

Histology and immunohistochemistry

Liver, tumour and lung tissues were fixed using neutral buffered saline containing 10% formaldehyde, dehydrated and embedded in paraffin, and cut into 4-μm-thick sections. The sections were dewaxed and stained with H&E or Sirius Red using standard protocols. Additional sections were stained immunohistochemically using the primary antibodies listed in Supplementary Table 5. Primary antibodies were detected by HRP-labelled secondary antibodies and subsequently stained using a peroxidase DAB kit; either Agilent (K3468) or Leica (DS92563) DAB for tissue processed in autostainer or Vector Laboratories (SK-4100) with haematoxylin as a counterstain (immunohistochemistry) or by fluorescent-labelled secondary antibodies (Invitrogen) with DAPI used as counterstain (SouthernBiotech, 0100-20) (immunofluorescence).

Microscopy and quantitative analysis of immunohistochemistry

Images were obtained on the Zeiss Axiovert 200 microscope using the Zeiss Axiocam MRc camera. For image analysis, stained slides were scanned using the Leica Aperio AT2 slide scanner (Leica Microsystems) at ×20 magnification. Quantification of blinded, stained sections (GS, Ki-67, CC3, CD3, yH2AX, p53) was performed using the HALO image analysis software (v.3.1.1076.363, Indica Labs). Quantification of microscopy tumour area in BM mice was performed based on nodules, independent of GS status. Quantification of Ki-67+ was by percentage/cell number and CC3+ by percentage/tumour area.

Lungs were microscopically analysed for the presence of extrahepatic HCC spread by examining H&E and GS sections. Metastasis was scored in a binary manner as detected or not-detected but was not analysed in respect to individual metastasis burden per mouse.

Images for tissue comparison to HCCOs were taken on the Zeiss 710 confocal microscope.

Tumour genotyping

After extraction from whole tumour, DNA was suspended in Transnetyx assay buffer and was analysed by Transnetyx using probes (p53Flox EX, Bap1-2 EX, PTEN-EX, LSL-EX-1, Tg-MYC, Axin1-1 EX, Ctnnb1-16 EX) and was additionally purified and concentrated using the Monarch Genomic DNA purification kit (New England Biolabs, T3010L) according to the manufacturer’s instructions. Generation of amplicons indicating successful recombination of genetic loci was performed by PCR (Eppendorf Mastercycler x50a) using the OneTaq Quick-Load 2× Master Mix with Standard Buffer (M0486S); reactions were set up according to the manufacturer’s instructions, amplification conditions (Supplementary Table 4) and primer sequences—β-catenin exon 3 (Supplementary Table 4) and KRASG12D (Supplementary Table 4). The resulting PCR reactions were separated by electrophoresis on 1.5% agarose (Melford, A20090-500) gel, using the size marker Quick-Load Purple 1 kb Plus DNA Ladder (New England Biolabs, N0550S) and bands were visualized using SYBR Safe DNA gel stain (Invitrogen, S33102). The Gels were imaged on the Chemi-Doc Imaging System (Bio-Rad). Concordance between CTNNB1 recombination results between the two methods was 100%. Where possible, the samples used in histological comparison were also assessed genotypically. Where not possible, due to DNA contamination/low quality, tumours were replaced by other end-stage tumours from the same cohort to achieve n ≥ 6 per cohort (total n = 4 additional samples).

Quantitative analysis of fluorescence immunohistochemistry

Fluorescent tiled images were generated on the Opera Phenix High-Content Screening System (Perkin Elmer) at ×20 magnification. Fluorescence was detected using the same settings throughout. Consecutive, non-overlapping fields were analysed blindly using Columbus Image analysis software (v.2.8.0.138890, Perkin Elmer). Positivity gating thresholds were defined using negative controls. For representative images, processing adjustments were performed equally.

Multiplex immunofluorescence immunohistochemical staining

Mouse liver samples (thickness, 4 µm) were sectioned and placed onto TOMO hydrophilic adhesive microscope slides (Matsunami, 0808228600). After antibody validation, semi-automated multiplex immunofluorescence staining was performed on the Ventana Discovery Ultra platform (Roche Tissue Diagnostics, RUO Discovery Universal v.21.00.0019). Fluorescence detection was performed using an Opal fluorophore tyramide-based signal amplification system (Akoya Biosciences). All primary antibodies were optimized using a pH 9 antigen retrieval solution (CC1, Roche Tissue Diagnostics, 06414575001) at 95 °C for 32 min. A denature step was applied using pH 6 antigen retrieval solution (CC2, Roche Tissue Diagnostics, 05279798001) for 24 min between each Opal detection and primary antibody application.

The primary–secondary–opal fluorophore combinations (CD45–HRP–Opal480; CD8–HRP–Opal690; CD4–HRP–Opal620; GranzymeB–HRP–Opal650; GS–HRP–Opal520; MYC–HRP–Opal570) are described in Supplementary Table 5. ImmPRESS rat and Opal 780 were manually applied in their specific sequences, and the remaining reagents were fully automated on the Ventana DISCOVERY ULTRA platform (Roche Tissue Diagnostics). Three drops of nuclear DAPI counterstain (Roche Tissue Diagnostics, 05268826001, RTU (Ready to Use), 24 min) were applied to each sample for nuclear detection.

Multiplex immunofluorescence image acquisition and analysis

Whole-slide images were collected on the PhenoImager HT multispectral slide scanner (Akoya Biosciences, v.1.0) using a ×10 objective before the acquisition of each region of interest (ROI) using a ×20 objective. Each ROI was spectrally unmixed using InForm (Akoya Biosciences, v.2.6.0) using a project-specific spectral library created using single-channel dyes and an autofluorescence mouse liver control.

Visiopharm was used for all image analysis. For tissue segmentation, a bespoke, in-house-trained deep learning algorithm (v.2024.06.0.19093 ×64) was trained using the deep learning module with a U-Net backbone, to segment each lobe into tumour, stroma, non-tumour GS and background ROIs. Tumour regions that were smaller than 10,000 µm2 were classified as a ‘clone’ region. Tumour and clonal regions were then dilated to generate ‘peritumoural stroma’ and ‘periclonal’ regions, respectively. Necrotic regions were manually segmented. Tumour regions were eroded to create ‘tumour centre’ and ‘tumour periphery’ regions. T cells were classified with a ‘T cell’ label within these regions using the threshold module (v.2024.07.1.16745 ×64) to threshold CD4, CD8 and CD45 fluorescence channels using the original image features. Each image was verified by a pathologist to confirm regional and T cell label segmentation. Post-processing steps were included to change T cell labels into their respective regional labels, that is, a T cell found within the tumour ROI would be changed to ‘tumour T cell’ and so on. Output variables were then generated and exported for downstream data analysis. Area of entire lobes were generated, and areas for each ROI as well as regional mean pixel intensities of MYC, mean pixel intensities of each marker in each T cell label and x–y coordinates of each T cell label. The Phenoplex Guided workflow was used for T cell phenotyping, which generated a phenotype list that was exported for data analysis.

Duplex immunofluorescence immunohistochemical staining

For duplex immunofluorescence immunohistochemical staining (Extended Data Fig. 12b (bottom)), 4-µm-thick mouse HCCOs and liver lobe samples were sectioned and placed onto TOMO hydrophilic adhesive microscope slides (Matsunami, 0808228600). After antibody validation, fully automated multiplex immunofluorescence staining was performed on the Ventana DISCOVERY ULTRA platform (Roche Tissue Diagnostics, RUO Discovery Universal v.21.00.0019). Fluorescence detection was performed using an Opal fluorophore tyramide-based signal amplification system (Akoya Biosciences). All primary antibodies were optimized using a pH 9 antigen retrieval solution (CC1, Roche Tissue Diagnostics, 06414575001) at 95 °C for 32 min. A denature step was applied using pH 6 antigen retrieval solution (CC2, Roche Tissue Diagnostics, 05279798001) for 24 min between each Opal detection and primary antibody application. The primary–secondary–opal fluorophore combinations (MYC–HRP–Opal570, GS–HRP–Opal520) are described in Supplementary Table 5. One drop of nuclear DAPI counterstain (Akoya, 232121) was applied to each sample for nuclear detection.

Duplex immunofluorescence image acquisition and analysis

Whole-slide images were collected on the PhenoImager HT multispectral slide scanner (Akoya Biosciences, v.1.0) using a ×20 objective using Motif mode. Images were spectrally unmixed using Inform (Akoya, v.2.6.0) using an autofluorescence liver control slide to remove autofluorescence.

Tumour scoring

H&E-stained sections and tumours were additionally assessed by a consultant liver histopathologist and UK liver pathology External Quality Assessment scheme member (T.J.K.) working at a national liver transplant centre. All assessment was undertaken blind to all other data, including genotype and sampling times. An initial screen of the first available 135 cases was made to identify prominent histological features in lesional and non-lesional liver that could be semi-quantitatively assessed.

Accepting the inherent limitations of semi-quantitative subjective histological assessment but using a single observer to remove interobserver considerations, semi-quantitative/ordinal scoring systems were created for lesional and non-lesional features. Slides containing transections of whole lobes from each animal were assessed as a whole, giving an overall score or impression rather than scoring on an individual-lesion basis.

Non-lesional liver was scored for steatosis (none, focal, abundant) and lobular inflammation (none, focal, abundant). A minority of slides included insufficient non-lesional liver for assessment.

For lesional assessment, the presence of glandular tumour, that is, meriting designation as adenocarcinoma (none, focal, extensive) and undifferentiated carcinoma (none, focal, abundant, exclusive) was assessed first. All hepatocellular neoplastic lesions had the morphological and cytological appearances of malignancy, that is, HCC. In all cases in which there was HCC, the following features were assessed using the categories in parentheses: lesional pattern (any from nested, trabecular, solid), lesional steatosis (none, focal, abundant), lesional cell ballooning (none, focal, abundant), intralesional inflammation (none, focal, abundant), lesional necrosis (none, focal, confluent, extensive), lesional cell apoptosis (none, focal, many), intralesional peliosis (none, focal, abundant), lesional nuclear grade (low, minimal/low pleomorphism; high, highly pleomorphic).

Whole-tumour RNA-seq

Whole tumour and healthy tissue were snap-frozen and stored at −80 °C. To cover the breadth of our models, for each cohort, tissue from the shortest and longest surviving mouse as well as tissue from mice with survival closest to median cohort survival was chosen. Tissue was homogenized using the Precellys Evolution homogenizer and bulk RNA was isolated using a Trizol (Invitrogen) extraction protocol according to the manufacturer’s instructions. RNA quality and quantity was analysed on the Nanodrop 2000 (Thermo Fisher Scientific) and Agilent 2200 TapeStation (D1000 screentape) systems. Only samples with RIN > 7 were used for library preparation. Libraries were prepared using a Lexogen QuantSeq FWD Kit (disease positioning) or the Illumina TruSeq stranded mRNA kit (tumour heterogeneity). Library quality and quantity were assessed using the 2200 TapeStation (Agilent) and Qubit (Thermo Fisher Scientific) systems. The libraries for the disease positioning were sequenced by Novogene Europe. The libraries for the tumour heterogeneity were run on the Illumina NextSeq 500 system using the high-output 75 cycle kit (2 × 36 cycle paired-end reads).

Mapping of RNA-seq expression data

Quality checks and trimming on the raw RNA-seq data files were done using FastQC v.0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), FastP (v.0.20.1)58, MultiQC (v.1.9)59 and FastQ Screen (v. 0.14.0)60. RNA-seq single-end reads were mapped to the GRCm39.103 version of the Mus musculus genome and annotated61 using STAR (v.2.7.8a)62. Expression levels were determined by FeatureCounts from the Subread package (v.2.0.1)63.

Computational disease positioning based on human TCGA data

TCGA data were downloaded using the GenomicDataCommons R package (v.1.12.0; https://bioconductor.org/packages/GenomicDataCommons)64, TCGA ‘HTSeq–counts’ and corresponding clinical annotations. TCGA mutational data were downloaded using maftools (v.2.4.2)65. Both human and mouse RNA-seq counts were normalized using VST from the DESeq2 (v.1.28.1 and v.1.44.0) package66 and then centred within a sample. Genes were reduced to those with direct one-to-one gene mapping between human and mouse genomes established by Ensembl, as retrieved from the biomaRt (v.2.56.1) package67,68. Singular-value decomposition (SVD) of the human data was performed followed by matrix factorization of both the human and mouse data into a 100-rank human space. UMAP of the combined dataset was executed using R package uwot (v.0.1.11; https://CRAN.R-project.org/package=uwot). An adjacency matrix was constructed from a nearest-neighbours search (RANN package v.2.6.1, https://CRAN.R-project.org/package=RANN) of the human and mouse SVD objects for clustering analysis. R package igraph (v.1.2.11 and v.2.0.3)69 was used to construct a graph object and the community structure was determined using Louvain •clustering.

Single-sample gene set enrichment analysis (ssGSEA) analysis was performed using the R package corto (v.1.2.4)70 with the Hallmark gene set71,72 downloaded using msigdbr (v.7.4.1; https://CRAN.R-project.org/package=msigdbr). Hoshida27 and Chiang28 (also downloaded using msigdbr) subclass classification was determined by the highest enriched subclass. Tumour immune cell estimation was performed using ConsensusTME73.

Visualization of data by a combination of the ComplexHeatmap (v.2.4.3 and v.2.14.0)74, ggplot2 (v.3.3.6 and v.3.5.1)75, cowplot (v.1.1.1; https://CRAN.R-project.org/package=cowplot) and viridis76 packages.

Human H&E-stained tissue sections were obtained from the TCGA collection (https://portal.gdc.cancer.gov/).

Validation of HuMo clusters in an independent HCC cohort

HuMo clusters were validated with the bulk RNA-seq data of an independent cohort of 171 HCC samples from patients undergoing resection collected in the setting of the HCC Genomic Consortium25 (European Genome-Phenome Archive: EGAS00001005364). Fastq files were aligned using STAR62 (v.2.5.1b) to the hg19 reference genome with gencode annotation v19 and were quantified using featureCounts63 (v.1.5.2). Raw counts were preprocessed and cluster attribution was performed as described above with the TCGA and mouse data. In the analysis of the transcriptomic data, positivity for previously reported gene signatures was evaluated using the Nearest Template Prediction77 module from GenePattern (v.3.9)78. The ssGSEA projection71 was performed using previously reported gene signatures as well as the Hallmark gene set downloaded using msigdbr (v.7.4.1). The mutational profile of 144 HCC samples was obtained by whole-exome sequencing25. Clinicopathological data (such as vascular invasion, AFP levels (≥400 ng ml−1)) were originally reported previously25.

Differential expression analysis for intertumoural heterogeneity

Genes were restricted to those with significance in all comparisons (with significance defined as adjusted P < 0.05 and log2[FC] > 1). Data were scaled and visualized using the ComplexHeatmap74 package. Gene Ontology over-representation analysis was performed using the clusterProfiler79 package (v.3.16.1).

Human sample ethical approval

The use of consenting patients’ tissues surplus to diagnostic requirements for research purposes was approved by the Newcastle and North Tyneside Regional ethics committee, the Newcastle Academic Health Partners Bioresource (NAHPB) and the Newcastle upon Tyne NHS Foundation Trust Research and Development (R&D) department, in accordance with Health Research Authority guidelines. (10/H0906/41; NAHPB Project 48; REC 12/NE/0395; R&D 6579; Human Tissue Act licence 12534).

MRI

Magnetic resonance imaging (MRI) scans were performed on liver-tumour-bearing mice using the nanoScan imaging system (Mediso Medical Imaging Systems). The mice were anaesthetized and maintained under inhaled isoflurane anaesthesia (induction, 4–5% (v/v); maintenance, 1.5–2.0% (v/v)) in 95% oxygen during the entire imaging procedure. Whole-body T1-weighted gradient echo (GRE) 3D coronal/sagittal MRI sequences (echo time (TE), 3.8 ms; repetition time (TR), 20 ms; flip angle, 30 degrees; and slice thickness, 0.50 mm) were used to obtain MRI images. For quantification of scans, volumes of interest were manually drawn around the liver region on MRI scans by visual inspection using VivoQuant software (v.4.0, InviCRO). For each scan, separate volumes of interest were prepared to adjust for the position and angle of each mouse on the MRI scanner and their tumour size.

Mouse HCCO culture, drug screening and imaging

HCCOs were extracted and cultured as previously described31,80, with the exception that HCCOs from mice with activated β-catenin signalling were cultured in the absence of WNT and RSPO1. All mouse HCCO cultures were regularly tested for mycoplasma.

For the high-throughput screen cohort 5 (BM) HCCOs were dissociated with TrypLE and plated at a density of 1 × 103 cells in 10 μl BME in prewarmed 384-well plates (Greiner BioOne, 781091) 5 days before adding the drugs. On day 0, a panel of 147 FDA-approved oncology drugs (AOD IX, acquired June 2019, https://dtp.cancer.gov/organization/dscb/obtaining/available_plates.htm) was added at a final concentration of 10 µM. Staurosporin was used as an internal positive control; DMSO and untreated cells were used as an internal negative control. The medium was changed on day 4 and the compounds were freshly added. Incucyte NucLight Rapid Red (Sartorius, 4717) was added on day 6 and cells were imaged using the Opera Phenix High-Content Screening System (Perkin Elmer) on day 9. Volumes were determined using Icy BioImage software (v.2.0.0.0; https://icy.bioimageanalysis.org)81. The experiment was performed twice (using different passages from one HCCO line) in technical quadruplicates.

For the drug dose–response curve screen, HCCOs (1–2 lines per cohort) were dissociated with TrypLE and plated at a density of 1 × 103 cells in 10 μl Matrigel (Corning, 356231) in prewarmed 96-well plates (Greiner BioOne, 655098). The treatment schedule was the same as for the HTP screen, except the medium was changed and fresh drugs were added on days 3 and 7. Drugs and concentrations are shown in the figures. Drugs were purchased from Selleckchem, dissolved in DMSO to 10 mM, aliquoted and stored at −20 °C. Cell viability was measured on day 9 using CellTitre-Glo 3D reagent (Promega, G9682) according to the manufacturer’s instructions. Luminescence was measured on the Spark Microplate Reader (Tecan). The results were normalized to the vehicle. Curve fitting and IC50 calculation were performed using a nonlinear regression equation. All of the experiments were performed in duplicate and at least three times using different passages from one to two HCCO lines per cohort.

Images of HCCOs were taken on an Olympus CKX41 using the Qimaging Retiga Exi Fast 1394 camera.

For immunofluorescence analysis, HCCOs were washed with ice-cold PBS, fixed with 4% PFA and permeabilized with 0.2% Triton X-100. A list of the antibodies used is provided in Supplementary Table 5. Images were taken using the Zeiss 710 confocal microscope.

Tumour-derived mouse HCCOs (available from all GEMMs) will be shared on reasonable request.

Human HCCO culture and drug screening

Human HCCOs were derived from liver cancer needle biopsies or liver resections as described before32. The following human HCCO lines were used: D386-O and D953-O (CTNNB1 WT, TP53 WT, MYC WT); D455-O (CTNNB1 MUT, TP53 WT, MYC AMP); C948-O and C949-O (CTNNB1 MUT, TP53 WT, MYC AMP); C655-O (CTNNB1 WT, TP53 MUT, MYC ND); D045-O, D046-O, D803-O and R035-O (CTNNB1 WT, TP53 MUT, MYC WT); C798-O, C975-O, D324-O, D804-O and D876-O (CTNNB1 WT, TP53 MUT, MYC AMP); and D359-O (CTNNB1 MUT, TP53 MUT, MYC WT).

For expansion, the human HCCOs were seeded into reduced growth factor BME2 (R&D Systems, 3533-005-02) and cultured in expansion medium (EM): advanced DMEM/F-12 (Gibco, 12634010) supplemented with 1× B-27 (Gibco, 17504001), 1× N-2 (Gibco, 17502001), 10 mM nicotinamide (Sigma-Aldrich, N0636), 1.25 mM N-acetyl-l-cysteine (Sigma-Aldrich, A9165), 10 nM [Leu15]-gastrin (Sigma-Aldrich, G9145), 10 μM forskolin (Tocris, 1099), 5 μM A83-01 (Tocris, 2939), 50 ng ml−1 EGF (Peprotech, AF-100-15), 100 ng ml−1 FGF10 (Peprotech, 100-26), 25 ng ml−1 HGF (Peprotech, 100-39), 10% RSPO1-conditioned medium (v/v, homemade). HCCOs were passaged after dissociation with 0.25% trypsin-EDTA (Gibco). All human HCCOs were regularly tested for mycoplasma contamination using the MycoAlert mycoplasma detection kit (Lonza, LT07-118).

Drugs were purchased from ApexBio and Selleckchem, dissolved in DMSO to 10 mM, aliquoted and stored at −20 °C. For the screening, human HCCOs were dissociated with 0.25% trypsin-EDTA (Gibco) to single cells and 1 × 103 cells per well were plated in a 384-well plate (Greiner BioOne, 781986) on a layer of BME2 (R&D Systems, 3533-005-02) previously diluted with EM (50:50, v/v). Cells were cultured for 3 days without treatment to allow for organoid formation. At day 3, an eight-point half-log dilution series of each compound (ranging from 10 μM to 0.00316 μM) was added using a Tecan D300e. Cell viability was measured after 5 days of treatment using the CellTiter-Glo 3D reagent (Promega, G9682). Luminescence was measured on the Synergy H1 multi-mode reader (BioTek Instruments). Results were normalized to the vehicle (DMSO). The maximal DMSO concentration was 0.2%. Curve fitting was performed using Prism (GraphPad v.9 GraphPad Software) software and the nonlinear regression equation. Results are shown as mean ± s.e.m.

Fluorescent activated cell sorting

After mincing into small pieces, 100 mg of healthy liver or liver tumour was digested on the gentleMACS Octo dissociator with heaters using the mouse tumour dissociation Kit (Miltenyi Biotec, 130-096-730). Dissociated cells were resuspended in 0.5% BSA in PBS, filtered through a 70 μm strainer and centrifuged at 400 g for 5 min. Cells were then resuspended in 5 ml RBC lysis buffer (Thermo Fisher Scientific, 00-4300-54) and incubated for 5 min at room temperature and washed with 0.5% BSA in PBS before being resuspended in 0.5% BSA in PBS. Cell suspensions were added to 96-well V-bottom plates (maximum density, 0.5 × 106 cells per well). Cells were stimulated for 3 h with complete IMDM medium containing 8% FCS, 50 μM β-mercaptoethanol, 1× penicillin–streptomycin with 1× cell activation cocktail with brefeldin A (BioLegend, 423304) at 37 °C as previously described previously82. After stimulation, cells were centrifuged at 800g for 2 min. Cells were stained in Brilliant stain buffer (BD Biosciences, 566349) containing antibodies for surface antigens for 30 min at 4 °C in the dark. Cells were then washed with PBS/0.5% BSA, centrifuged at 800g for 2 min, followed by ice-cold PBS and incubated with Zombie NIR Fixable Viability dye (BioLegend, 423106) to stain dead cells for 20 min at 4 °C. After further washing the cells with PBS/0.5% BSA, cells were fixed and permeabilized in FOXP3 transcription factor fixation/permeabilization solution (Thermo Fisher Scientific) for 20 min at 4 °C, according to the manufacturer’s instructions. Intracellular antibodies were prepared in permeabilization buffer and incubated with cells for 30 min at 4 °C before cells were washed with permeabilization buffer, followed by PBS/0.5% BSA and finally resuspended in PBS/0.5% BSA. All of the experiments were performed using a five-laser BD LSRFortessa flow cytometer with DIVA software (BD Biosciences v.8.0.1). Compensation was determined automatically using Ultracomp eBeads (01-2222-42; Thermo Fisher Scientific). Data were analysed using FlowJo Software v.9.9.6.

Statistics and reproducibility

Statistical analyses were performed using GraphPad Prism, software (v9 GraphPad Software) and R (v.4.0.2 and higher) with statistical tests as indicated in the figure legends. Data were tested for normal distribution. All performed t-tests were two-tailed. P values are displayed in figures. No statistical methods were used to predetermine sample sizes but our sample sizes are similar to those reported in previous publications6,31,83,84,85. For animal experiments, biological replicate sizes were chosen taking into account the variability observed in pilot and previous studies in respective cohorts. For all experiments, animal/sample assignment was matched for age-matched control, and assigned based on randomly assigned mouse identification markings. Batched staining and analyses alongside controls were used throughout. The investigators were not blinded for the in vivo experiments. Technical staff administering therapy were blinded to the mouse genotypes. All subsequent tissue handling and analysis were blinded and/or performed using standardized automated analyses where possible. Quantitative image analysis was performed blinded to the genotype and treatment. The data distribution for normality testing and testing of equal variances was assessed using Prism 9 Software (GraphPad Software). No data were excluded, unless mentioned otherwise, except the following, which were excluded before analysis: one biological replicate failed quality control after transcriptomic sequencing—all other biological replicates from this and other cohorts successfully passed quality control and were included in downstream analysis; two drugs were excluded from the HCCO HTP screen due to microbiological contamination and drug precipitation in multiple replicates, respectively. One sample was excluded from the RFP expression analysis during analysis (total n = 4 biological replicates): testing AAV-mediated recombination of RFP alleles (Extended Data Fig. 1b,c), one sample was a notable outlier (4.9% versus 25.7%, 25.1% and 25.8%) which on re-review was caused by inconsistent RFP staining of the section—this outlier was removed from final analysis; details are provided in the figure legend also. Where the tumour number could not be quantified due to tumour rupture, no tumour number is reported (Fig. 5d).

Figures were assembled using Scribus v.1.4.8 (https://www.scribus.net/). Images were processed using Gimp v.2.10.14 (https://www.gimp.org/).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.