Figure 4

RNA sequencing data from the TCGA dataset. (a), Hierarchical clustering and heatmaps. The 100 genes with the highest variance were used to generate the heatmaps. The Euclidean distance and average variance methods were used to generate the hierarchical clustering dendrogram. (b), PCA plots showing clusters of samples labeled by p16 IHC status and Annot-CLAM prediction. p16-negative cases and cases incorrectly predicted by the Annot-CLAM model fell within the cluster of p16-positive cases.
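
The procedure behind panel (a) can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names and expression values are invented toy data, the helper names `top_variance_genes` and `average_linkage` are hypothetical, and average linkage is assumed here as one possible reading of the clustering method named in the caption. The sketch ranks genes by variance, keeps the top ones, and merges samples agglomeratively using Euclidean distance.

```python
from math import dist
from statistics import pvariance


def top_variance_genes(expr, n_top):
    """Return the n_top gene names with the highest variance across samples.

    expr maps gene name -> list of per-sample expression values.
    """
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return ranked[:n_top]


def average_linkage(points):
    """Agglomerative clustering with Euclidean distance and average linkage.

    Assumption: average linkage is used for illustration only.
    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram is drawn from.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise distance between members of the two clusters.
                total = sum(
                    dist(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy data (hypothetical): 3 genes measured in 4 samples.
expr = {
    "GENE_A": [1.0, 1.2, 9.0, 9.5],
    "GENE_B": [5.0, 5.1, 5.0, 5.2],
    "GENE_C": [0.5, 0.4, 7.0, 7.2],
}

genes = top_variance_genes(expr, n_top=2)  # the figure uses the top 100
samples = [[expr[g][s] for g in genes] for s in range(4)]
merges = average_linkage(samples)

for members_a, members_b, d in merges:
    print(members_a, members_b, round(d, 3))
```

In this toy example the nearly constant gene is dropped, and the samples merge into two pairs before the final join. On real data, a library routine for hierarchical clustering would normally replace the quadratic loop shown here.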