Extended Data Fig. 2: Activation of additional promoters is associated with gene expression upregulation.

a) The number of active promoters normalized to the number of expressed genes for each individual sample, grouped by disease stage (N = 8, 147 and 104 for benign, localized PCa and mCRPC, respectively). Genes with nonzero counts were considered expressed. Box plots show data from the 25th to the 75th percentile, with the median as a line inside the box. Whiskers extend to 1.5 times the interquartile range (IQR) from the lower and upper quartiles. b) Upregulated and downregulated genes were identified by differential gene expression analysis. Bar plot shows the percentage of genes in each category that switch between single-promoter active (SP) and multiple-promoter active (MP) between benign prostate and localized PCa (left) or mCRPC (right). Activated: switch from SP in benign to MP in tumors. Deactivated: switch from MP in benign to SP in tumors. Inactive: SP in both benign and tumors. Constitutively active: MP in both benign and tumors. c) RNA-seq coverage across the gene body from 5’ to 3’ for ten random samples from each of the datasets (PAIR, CPCG and WCDT) in our data collection. d) EDASeq bias plot of the positional biases in unnormalized promoter counts of all samples from the RNA-seq datasets (PAIR, CPCG and WCDT) in our data collection. e) Number of genes switching from single-promoter active in benign prostate to multiple-promoter active in localized PCa (left) or mCRPC (right), using the RNA-seq datasets all down-sampled to 80 M reads/sample. SP: single-promoter active; MP: multiple-promoter active (Fisher’s exact tests, two-sided). f) Principal component analysis of all samples of different disease stages from the three cohorts, using the down-sampled RNA-seq datasets.
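The four switching categories defined in panel b can be illustrated with a minimal sketch. It assumes each gene has already been summarized as a count of active promoters in benign tissue and in tumors; how that per-condition count is derived is not specified in this legend. The function names (`promoter_state`, `switch_category`, `category_percentages`) are hypothetical, and treating a gene with no active promoter as an error is an assumption made only for illustration.

```python
from collections import Counter

# Category labels follow the definitions given in panel b.
CATEGORY = {
    ("SP", "MP"): "Activated",
    ("MP", "SP"): "Deactivated",
    ("SP", "SP"): "Inactive",
    ("MP", "MP"): "Constitutively active",
}


def promoter_state(n_active: int) -> str:
    """Return 'SP' for one active promoter and 'MP' for two or more."""
    if n_active < 1:
        # Assumption: genes without any active promoter are out of scope here.
        raise ValueError("gene has no active promoter")
    return "SP" if n_active == 1 else "MP"


def switch_category(n_benign: int, n_tumor: int) -> str:
    """Classify one gene from its active-promoter counts in benign and tumor."""
    return CATEGORY[(promoter_state(n_benign), promoter_state(n_tumor))]


def category_percentages(pairs):
    """Percentage of genes in each category for (n_benign, n_tumor) pairs."""
    counts = Counter(switch_category(b, t) for b, t in pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genes supplied")
    return {name: 100.0 * counts[name] / total for name in CATEGORY.values()}
```

Applied separately to the upregulated and the downregulated gene sets, `category_percentages` would yield the kind of per-category percentages that a bar plot such as panel b displays.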