Fig. 5: Prostate cancer-specific expression of the hCRISPRs in the NGS-ProToCol dataset.

a Unsupervised clustering (Euclidean distances; Ward.D2) of malignant (orange; top-bar) and healthy (blue; top-bar) prostate tissues using normalized read counts (VST) as Z-scores over all 177 statistically significant hCRISPRs (|log2 fold-change| ≥ 0.5, average read-count over all samples ≥ 10, and q ≤ 0.05). Negative Z-scores are highlighted in green whilst positive Z-scores are highlighted in red. b Volcano plot depicting the log2 fold-change (x-axis) and adjusted p-value (q) (y-axis; in −log10 scale) of all 12,572 hCRISPRs. The hCRISPRs found to be differentially upregulated (log2 fold-change ≥ 0.5, average read-count over all samples ≥ 10, and q ≤ 0.05) and downregulated (log2 fold-change ≤ −0.5, average read-count over all samples ≥ 10, and q ≤ 0.05) in cancer are shown by red and blue dots, respectively. The hCRISPR identifier (#Order) is shown for the top 25 most significant (based on adjusted p-value) upregulated and downregulated hCRISPRs. c Overview of the stratification of the NGS-ProToCol cohort using principal component analysis (PCA) on the 177 differentially expressed hCRISPRs (using VST-normalized read counts), shown for the first two principal components (PC1 and PC2). Malignant prostate tissues are depicted by salmon points whilst the normal adjacent-to-tumor prostate tissues are depicted by blue points. d Boxplots representing the normalized expression (VST-transformed read-count) of the top ten hCRISPRs with an absolute log2 fold-change ≥ 1 and average read-count ≥ 20, ordered by descending q-value; the median is highlighted with a bold black line, and Q1 and Q3 are shown by error bars. Malignant prostate tissues are depicted by salmon points whilst the normal adjacent-to-tumor prostate tissues are depicted by blue points. Normalized expression (VST-transformed read-count; y-axis) is shown in log10 scale.
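The selection criteria used in panels a and b can be illustrated with a minimal sketch. This is not the authors' pipeline: the `DEResult` record, the `classify` function, and the example values are hypothetical, and the sketch assumes that upregulated means a log2 fold-change of at least 0.5 and downregulated a log2 fold-change of at most −0.5, with the read-count and q thresholds applied to both.

```python
from dataclasses import dataclass


@dataclass
class DEResult:
    """One differential-expression result (hypothetical record layout)."""
    identifier: str    # illustrative hCRISPR identifier, not from the dataset
    log2_fc: float     # log2 fold-change, malignant vs. normal
    mean_count: float  # average read-count over all samples
    q_value: float     # adjusted p-value


def classify(r, min_abs_lfc=0.5, min_mean=10.0, max_q=0.05):
    """Label a result as 'up', 'down', or 'not significant'."""
    # Abundance and significance filters apply to both directions.
    if r.mean_count < min_mean or r.q_value > max_q:
        return "not significant"
    if r.log2_fc >= min_abs_lfc:
        return "up"
    if r.log2_fc <= -min_abs_lfc:
        return "down"
    return "not significant"


# Made-up values chosen to exercise each branch of the filter.
examples = [
    DEResult("A", 1.2, 50.0, 0.001),   # passes all thresholds, positive change
    DEResult("B", -0.8, 30.0, 0.01),   # passes all thresholds, negative change
    DEResult("C", 0.3, 100.0, 0.001),  # fold-change too small
    DEResult("D", 2.0, 5.0, 0.001),    # average read-count too low
    DEResult("E", -1.5, 40.0, 0.2),    # q above 0.05
]
labels = {r.identifier: classify(r) for r in examples}
```

Under these assumptions, only the results labelled "up" or "down" would be coloured in the volcano plot and counted among the differentially expressed hCRISPRs.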