Fig. 2: Transcriptional gene signatures derived from genetically-distinct HCC models recapitulate human HCC heterogeneity and predict patient prognosis.

a Heatmap of unsupervised hierarchical clustering depicting the global gene expression level as TPMs of genetically-distinct HCCs (MycOE/Trp53KO n = 3, MycOE/PtenKO n = 3, NrasG12D/PtenKO n = 4, NrasG12V/PtenKO n = 3). Top annotations represent the classification of HCC mouse models’ transcriptomic profiles based on the human molecular HCC subtypes64,65,66,67,68. *FDR < 0.01. b Heatmap of unsupervised hierarchical clustering depicting the link between the MAPK, PI3K, and MYC signaling pathway enrichment (Supplementary Fig. 2c) and patients correlating to genetically-distinct HCC-derived transcriptional signatures across TCGA: Liver Hepatocellular Cancer (LIHC) patients (n = 423)103 (Supplementary Data 2). c Kaplan–Meier survival curves displaying TCGA:LIHC patients103 segregated according to their high/low correlation with the transcriptional signatures of each genetically-distinct HCC relative to control. Lines at survival probability = 0.5 depict median survival. Risk tables show the number of patients at the indicated time points (in months). d Representative IHC image for p-ERK1/2 and MYC performed on human HCC TMA sections from the Wu et al. dataset69. Pie charts depict the percentage of patients positive for p-ERK1/2 and MYC. (Scale bars = 100 µm; representative of n = 99 p-ERK1/2 positive patients and n = 78 MYC positive patients). e Kaplan–Meier curves displaying the overall survival (OS) and recurrence-free survival (RFS) of HCC patients (from the Wu et al. dataset69) segregated according to p-ERK1/2 negative (n = 368) or positive (n = 99) and c-MYC negative (n = 389) or positive (n = 78) staining in cancer cells. Risk tables show the number of patients at the indicated time points (in months) (see Supplementary Data 4 for median OS and RFS time). Statistical significance was determined by one-sided Fisher’s test using Bonferroni multiple testing correction (a), log-rank test (c, e). The shading represents the 95% confidence interval (c, e).
TPMs Transcripts per Million. Source data are provided as a Source Data file.