Introduction

In the United States, the age-adjusted mortality rate from prostate cancer (PCa) among Black men is at least twice that of men of other racial and ethnic groups1. The contributors to this disparity are multifactorial, including societal (e.g., structural and systemic racism), social (e.g., socioeconomic status, educational attainment), neighborhood (e.g., diet, pollution), institutional (e.g., access to care), and individual-level, genetic ancestry-related (e.g., genetics, genomics, physiology) drivers of health2,3,4,5. Other factors may also compound this disparity. For instance, in equal-access health systems, Black men have an increased risk of PCa, are diagnosed at a younger age, and have more frequent disease progression with shorter time to metastasis than White men following treatment, suggesting a more aggressive disease biology6. Recent studies have also reported that Black PCa patients respond better to secondary hormonal therapy, chemotherapy, radiotherapy, and cellular immunotherapy than White PCa patients7,8,9,10,11,12,13,14.

Individual-level, genetic ancestry-related drivers of health, such as single nucleotide variants (SNVs), epigenetics, miRNAs, non-coding RNAs, aggregate gene expression, and RNA splice variants, may also contribute to PCa racial disparities15,16,17,18,19,20. RNA splicing is an essential step in gene expression that enables cells to diversify gene function by generating different protein isoforms21. It has been implicated in all hallmarks of cancer and in cancer disparities22,23. An analysis of the association between alternative RNA splicing and biochemical recurrence of PCa has identified a prognostic signature24.

Comparative analyses of alternative RNA splicing and transcription events (ARSs) in cancers between Black and White patients have identified race- and genetic ancestry-related events20,25. We have previously reported two studies on race- and genetic ancestry-related RNA splicing in PCa. In one study, we were part of a team that initially reported race-related alternative RNA splicing events in PCa by performing an exon array analysis of total RNA isolated from PCa biopsy cores from Black and White patients25. In that study, we reported that the exon 20 skipping event within Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Delta (PIK3CD) was increased in PCa among Black patients compared to White patients, contributed to increased PCa growth, invasion, and metastasis in vitro and in vivo, and was associated with PCa patient survival25. In a follow-up study, we reported race- and genetic ancestry-related RNA splicing and transcription in seven cancer types, including PCa, using publicly available data from The Cancer Genome Atlas (TCGA)20. A number of these events occur in genes relevant to oncogenesis and are also associated with patient survival.

These previous studies using a small number of biopsy samples, first-generation exon arrays, or computational analysis using TCGA indicated that further in-depth analysis of genetic ancestry-related ARSs in PCa has the potential to identify unique targets for the development of new biomarkers and therapeutic agents for diverse populations. To this end, the present GENomics of CAncer DisparitiEs (GENCADE) study collected tumors and paired tumor adjacent normal (TAN) tissues from Black and White patients with high-grade or low-grade PCa undergoing radical prostatectomy. We estimated the genetic ancestry of patients and conducted high-depth RNA seq to analyze the transcriptome at the exon level and its association with signaling pathways and biochemical recurrence (BCR).

Results

Black PCa patients enrolled in GENCADE have a higher probability of BCR than White PCa patients enrolled in GENCADE

Thirty-nine self-reported Black and 39 self-reported White PCa patients were enrolled in the GENCADE study (Supplementary Tables 1 and 2). Thirty-seven Black patients and 35 White patients who underwent radical prostatectomy (RP) were included in the analysis. Per surgical Gleason score pathology following RP, patients were grouped into those with high-grade tumors (4 + 3, 4 + 4, 4 + 5, and 5 + 5) or those with low-grade tumors (3 + 3 and 3 + 4) (Fig. 1A). None of the enrolled patients had preoperatively documented metastasis, but on RP pathology, a number of patients had regional metastasis to lymph nodes (five Black and two White patients with N1, 20 Black and 21 White patients with N0, and 30 not conducted). Patients with BCR either underwent salvage radiation treatment with or without androgen deprivation therapy (ADT) more than 180 days post-prostatectomy or had a documented increase in prostate-specific antigen (PSA) level > 0.2 ng/ml. The risk of BCR differed when stratifying by race and PCa grade (log rank; p = 0.031); compared with White patients with high-grade PCa, Black patients with high-grade PCa had an increased risk of BCR (Fig. 1B, Supplementary Table 3). Over a median 58-month follow-up period, metastasis was documented among seven patients, four Black and three White patients, all with high-grade PCa. Analysis of the probability of metastasis did not show a significant difference between Black and White patients (Supplementary Fig. 1).
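As a hedged illustration of how a BCR-free probability curve such as the one in Fig. 1B can be estimated, the following minimal Python sketch implements the standard Kaplan-Meier product-limit estimator. The follow-up times and event flags are invented toy values, not GENCADE data, and the function name `kaplan_meier` is an assumption made for illustration. The log-rank comparison between groups is not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    times  -- follow-up in months for each patient
    events -- 1 if BCR was observed at that time, 0 if the patient was censored
    """
    at_risk = len(times)
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # events and censorings leaving the risk set
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Toy cohort (hypothetical values): five patients, three with BCR.
toy_times = [6, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0]
for month, prob in kaplan_meier(toy_times, toy_events):
    print(f"month {month}: BCR-free probability {prob:.2f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why follow-up length matters when comparing groups.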

Fig. 1: GENCADE cohort.

A Number of PCa patients enrolled in GENCADE by self-reported race and surgical pathology Gleason grade, high-grade or low-grade. B Kaplan-Meier plot showing the BCR for Black and White GENCADE patients. The p value (p) is shown. C Density plots showing age distribution in years (left), BMI (middle), and PSA (right) for Black and White patients in GENCADE. Y-axis shows the density. P values (p) are shown. D Bar plots showing the genetic ancestry composition, African or European, for PCa patients enrolled in GENCADE. NA19238-40 represents the YRI reference population for African ancestry and NA12878, NA12891-2 represent the CEPH reference population for European ancestry.

A comparison of the distribution of patient age at the time of study enrollment among Black and White patients showed that Black patients trended toward being younger at RP than White patients (Fig. 1C, left). Analysis of BMI showed an average BMI of 30.3 (obese range) among Black patients and BMI = 29.7 (overweight range) among White patients (Fig. 1C, middle). The baseline PSA for 51 of the 76 patients (patients whose PSA was documented) was less than 10 ng/ml, for 17 of the 76 patients was between 10 and 20 ng/ml, and for eight of the 76 patients was greater than 20 ng/ml (median = 7.67 ng/ml). We did not observe a significant difference in PSA levels at the time of enrollment between Black and White patients (Fig. 1C, right). Estimation of patient genetic ancestry showed concordance with patient self-reported race (Fig. 1D, Supplementary Table 4). Given that patient estimated genetic ancestry was concordant with patient self-reported race in the GENCADE cohort, hereafter, we refer to ARSs between Black and White patients as genetic ancestry-concordant ARSs. For comparative transcriptomic analysis of PCas and TANs from Black and White patients, specimens were collected, graded by a genitourinary pathologist, and submitted for high-depth RNA seq (Fig. 2).

Fig. 2: Study flow chart.

Analyzed specimens from GENCADE PCa patients were either from radical prostatectomy (RP) or biopsy (Bx). The tumor and TAN specimens were evaluated by a genitourinary pathologist, including Gleason scoring of tumor specimens. The specimens from which RNA was successfully purified and analyzed are shown in the table. ARS and DE comparative analyses were conducted, as shown below the table.

Identification of ARSs between tumors and TANs among Black and White PCa patients

To investigate the burden of ARSs in PCa, we compared ARSs in tumors to TANs among Black patients and among White patients (Fig. 3). The number of ARSs between tumors and TANs (tumor vs. TAN) among Black patients with high-grade or low-grade PCa was 926 and 1835, respectively (Fig. 3A, Supplementary Table 5). In contrast, the number of ARSs between tumors and TANs (tumor vs. TAN) among White patients with high-grade or low-grade PCa was 2524 and 2382, respectively. In high-grade tumor vs. TAN, 535 ARSs were unique to Black patients and 2128 were unique to White patients, with only 387 ARSs overlapping between Black patients and White patients (Fig. 3B, Supplementary Table 5). In low-grade tumor vs. TAN, 666 ARSs were unique to Black patients and 1210 were unique to White patients, with 1162 ARSs overlapping.

Fig. 3: ARSs and DEs between tumors and TANs from Black patients and from White patients.

A Bar plot showing the number of ARSs between tumors and TANs among Black patients or among White patients with high-grade or low-grade PCa. The x-axis shows the Gleason category, high-grade or low-grade, and the y-axis shows the number of ARSs (# of ARSs (tumor vs TAN)). B Venn diagrams showing the number of overlapping ARSs in the indicated comparisons. C Pie charts showing the proportions of ARS types in the indicated comparisons. The ARS types are alternative acceptor (AA), alternative donor (AD), alternative promoter (AP), alternative terminator (AT), exon skipping (ES), mutually exclusive exons (ME), and retained intron (RI). D Bar plot showing the number of DEs between tumors and TANs among Black patients or among White patients with high-grade or low-grade PCa. The x-axis shows the Gleason category, high-grade or low-grade, and the y-axis shows the number of DEs (# of DEs (tumor vs TAN)). E Venn diagrams showing the number of overlapping DEs in the indicated comparisons. F Venn diagrams showing the number of genes undergoing ARSs and DEs in the indicated comparisons.

The proportions of the types of ARSs in tumor vs. TAN among Black patients and among White patients with either high-grade or low-grade tumors were largely similar, with exon skipping (ES): 31–42%, alternative promoter (AP): 22–30%, alternative terminator (AT): 8–13%, and retained intron (RI): 11–13% (Fig. 3C). The proportion of novel ARSs in various comparisons was 5–7% (Supplementary Fig. 2A).

Given that RNA splicing occurs co-transcriptionally and to compare the mechanisms of dysregulation of expression between prostate tumors and TANs, we analyzed the burden of differential aggregate gene expression (DE) between prostate tumors (high-grade or low-grade) and TANs among Black patients and among White patients. The number of genes undergoing DE in tumor vs. TAN among Black patients was 5273 and 4578 in high-grade and low-grade tumors, respectively, and the number of DEs in tumor vs. TAN among White patients was 6126 and 5797 in high-grade and low-grade tumors, respectively (Fig. 3D, Supplementary Table 6). Notably, most DEs in tumor vs. TAN among Black and White patients with high-grade or low-grade PCa were downregulated in tumors compared with TANs, and a large number of DEs in tumor vs. TAN overlapped between Black and White patients with high-grade or low-grade PCa (Fig. 3E). The predominant downregulation of DEs in tumors from Black and White patients may reflect tumor suppressive mechanisms. Specifically, 4730 DEs in high-grade tumor vs. TAN were identified among Black and White patients, whereas 543 and 1396 were unique among Black patients and White patients, respectively. In addition, 4330 DEs in low-grade tumor vs. TAN were identified among Black and White patients, with 248 and 1467 being unique among Black patients and White patients, respectively. These results indicate that unlike the ARSs in tumor vs. TAN among Black patients and White patients, the vast majority of DEs in tumor vs. TAN among Black patients and White patients overlap. Overlapping the genes from ARS and DE analyses showed that 251 and 631 genes undergoing ARSs in tumor vs. TAN among Black patients or among White patients with high-grade PCa, respectively, also exhibited DE, and 382 and 596 genes undergoing ARSs in tumor vs. TAN among Black patients or among White patients with low-grade PCa, respectively, also exhibited DE (Fig. 3F).
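The overlap counts reported above can be checked with simple arithmetic. The sketch below uses only counts stated in the text; the helper names `venn_partition` and `jaccard` are hypothetical, and the Jaccard index is an illustrative summary chosen here, not a statistic used in the study. It recomputes the DE partitions from the group totals and contrasts the share of overlapping events for DEs and ARSs in the high-grade comparison.

```python
def venn_partition(total_a, total_b, shared):
    """Split two set sizes into (unique to A, shared, unique to B)."""
    if shared > min(total_a, total_b):
        raise ValueError("shared count cannot exceed either total")
    return total_a - shared, shared, total_b - shared


def jaccard(unique_a, shared, unique_b):
    """Shared events as a fraction of all events seen in either group."""
    return shared / (unique_a + shared + unique_b)


# DE totals (Black, White) and shared counts as reported in the text.
de_high = venn_partition(5273, 6126, 4730)
de_low = venn_partition(4578, 5797, 4330)

# ARS Venn counts for high-grade tumor vs. TAN as reported in the text.
ars_high = (535, 387, 2128)

print("DE high-grade partition:", de_high)
print("DE low-grade partition:", de_low)
print(f"DE overlap (high-grade): {jaccard(*de_high):.2f}")
print(f"ARS overlap (high-grade): {jaccard(*ars_high):.2f}")
```

Under this summary, most high-grade DEs are shared between Black and White patients, whereas only a small fraction of high-grade ARSs are, which is the contrast the paragraph draws.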

Alternative polyadenylation (APA) is a mechanism that generates distinct 3’ ends of transcribed RNAs and has been found to influence gene expression and play a role in cancer26,27. To examine whether genes undergoing differential ARSs (D-ARSs) or DEs also undergo differential APA events (D-APAs), we identified APAs in tumor vs. TAN among Black patients or among White patients (Supplementary Fig. 6A, Supplementary Table 7). Comparing the genes undergoing DEs and/or ARSs to the genes undergoing APAs, we found that approximately 1–3% of genes undergoing DEs also undergo D-APAs and that only a few genes undergoing D-ARSs (3–25 genes) also undergo D-APAs (Supplementary Fig. 6B). Interestingly, among the genes undergoing D-APAs between tumors and TANs, a number have been reported to undergo D-APAs in PCa post-enzalutamide treatment (Supplementary Fig. 6C)28.

Genes undergoing ARSs in prostate tumor vs TAN among Black and among White patients are enriched in cancer relevant pathways

To determine the enriched pathways among genes undergoing ARSs or DEs between tumors and TANs unique to Black or White patients with high-grade or low-grade PCa, we conducted Gene Set Enrichment Analysis (GSEA)29. GSEA revealed that genes undergoing ARSs in tumor vs. TAN among Black patients or among White patients with high-grade PCa were enriched in pathways whose important role in cancer has been extensively reported, such as PI3K AKT MTOR signaling, G2M checkpoint, and IL6 JAK STAT3 signaling (Fig. 4A).

Fig. 4: GSEA for genes undergoing ARSs or DEs between tumors and TANs among Black patients and among White patients.

Point plots showing the hallmark pathways in the molecular signature database (MSigDB) in which genes undergoing ARSs (A) or DEs (B) in the indicated comparisons are enriched. The x-axes show the normalized enrichment score (NES) and the y-axes show the pathway name. The size of the point corresponds to the number of genes undergoing ARSs (A) or DEs (B).

Similarly, GSEA revealed that DEs in tumor vs. TAN among Black patients or among White patients with high-grade or low-grade PCa were enriched in cancer-related pathways, such as KRAS signaling, Epithelial Mesenchymal Transition, WNT Beta Catenin signaling, IL6 JAK STAT3 signaling, and Androgen Response (Fig. 4B).

Additionally, we found that pathways enriched among overlapping genes undergoing ARSs or DEs in tumor vs. TAN among Black and White patients with high-grade or low-grade PCa were largely different from pathways enriched among unique genes undergoing ARSs or DEs in tumor vs. TAN among Black patients or among White patients with high-grade or low-grade PCa (Supplementary Fig. 3A and 3B).

Cancer promoting genes undergo ARSs in tumors and TANs between Black and White PCa patients

To investigate genetic ancestry-concordant ARSs in prostate tumors and TANs, we compared ARSs in tumors and TANs between Black and White PCa patients (Fig. 5). A total of 405 ARSs were identified in high-grade tumors and 253 in low-grade tumors (Fig. 5A). Therefore, a larger number of ARSs were identified in tumors between Black and White patients with high-grade PCa. Surprisingly, we found a slightly higher number of ARSs in TANs, with 459 in TANs associated with high-grade tumors and 341 in TANs associated with low-grade tumors. It is well known that tumor-free TANs are not entirely normal prostate tissue and likely contribute to the tumor microenvironment30. This raises the possibility that these ARSs in TANs contribute to the differential PCa incidence, development, and progression between Black patients and White patients. Among the ARSs, nine overlapped between high-grade and low-grade tumors, and 27 overlapped between TANs associated with high-grade tumors and TANs associated with low-grade tumors (Fig. 5B, top). In addition, 25 ARSs overlapped between high-grade tumors and TANs and 40 overlapped between low-grade tumors and TANs (Fig. 5B, bottom). These results showed that the majority of genetic ancestry-concordant ARSs in high-grade vs. low-grade tumors and TANs and in tumor vs. TAN did not overlap, indicating that these ARSs were largely specific to the Gleason category, high-grade or low-grade, and to tissue type, tumor or TAN.

Fig. 5: Genetic ancestry-concordant ARSs and DEs in tumors and TANs.

A Bar plot showing the number of ARSs between Black and White patients in high-grade or low-grade tumors or TANs. The x-axis shows the Gleason category, high-grade or low-grade, and the y-axis shows the number of ARSs (# of ARSs (Black vs White)). B Venn diagrams showing the number of overlapping ARSs in the indicated comparisons. C Pie charts showing the proportions of ARS types in the indicated comparisons. D Box plots showing the PSI of selected ARSs between Black patients and White patients in high-grade tumors (top) or low-grade tumors (bottom). The x-axis shows the ARSs as gene names followed by ARS types followed by exon number(s) involved and the y-axis shows the PSI. Each dot represents a patient. The boxes represent data between the 25th and 75th percentile, and the lines in the middle represent the medians. E Bar plot showing the number of DEs between Black and White patients in high-grade or low-grade tumors or TANs. The x-axis shows the Gleason category, high-grade or low-grade, and the y-axis shows the number of DEs (# of DEs (Black vs White)). F Venn diagrams showing the number of overlapping DEs in the indicated comparisons. G Venn diagrams showing the number of genes undergoing ARSs and DEs in the indicated comparisons.

The proportions of the types of ARSs showed that the ES type was predominant (54–69%), compared with AP: 12–16%, AT: 3–6%, and RI: 2–6% (Fig. 5C). Thus, the proportions of types of genetic ancestry-concordant ARSs differed from the proportions of types of ARSs in tumor vs. TAN (Fig. 3C). This suggests that the mechanism(s) leading to genetic ancestry-concordant ARSs may differ from those leading to ARSs between tumors and TANs. The proportion of novel ARSs in the various comparisons was 7–14% (Supplementary Fig. 2B). Thus, we noted a slight increase in novel events among genetic ancestry-concordant ARSs compared to ARSs identified in tumor vs. TAN.

To investigate the function of genes undergoing these ARSs, we searched for these genes in the molecular signatures database hallmark and oncogenic gene sets29,31. Genes undergoing genetic ancestry-concordant ARSs in PCa with the hallmark and oncogenic gene sets to which they belong are shown in Table 1 and Supplementary Table 8, respectively. Among these are genes that have been previously reported in PCa, including the following: FGFR2, FGFR3, H6PD, MDM2, RAD51D, CDK7, VEGFA, and XRCC4 (Supplementary Table 9). Examples of the ARSs in these genes in high-grade tumors were the exons 15 | 16 ME event in FGFR2, exons 8 | 9 ME event in FGFR3, exon 1 AP event in H6PD, exon 5.1:5.2:6:7:8.1:8.2:10:11:12.1:12.2 ES event in MDM2, and exon 6:7 ES event in RAD51D. Examples of the ARSs in these genes in low-grade tumors were the exon 3:4 ES event in CDK7, exon 6:7.1:7.2:7.3:8.1:8.2 ES event in VEGFA, and exon 8.1 AA event in XRCC4 (Fig. 5D). Interestingly, we also found a number of genes undergoing ARSs in tumors between Black and White patients with high-grade or low-grade PCa that have not been extensively investigated in PCa, such as ABLIM3, MYBPC1, CPNE2, DLG1, DLG4, and SLFN12. However, all of these genes have the potential to play important roles in PCa progression. For example, ABLIM3 belongs to the LIM domain proteins, which have been implicated in the onset and progression of several cancers32. Another example is SLFN12, whose paralog SLFN11 has been found to play a role in DNA damage repair33.

Table 1 List of genes undergoing ARSs in PCa between Black and White patients and functioning in tumorigenic pathways

As part of the research team of Wang et al., we previously reported race-related alternative RNA splicing events in PCa by performing an exon array analysis of total RNA isolated from PCa biopsy cores from 20 Black patients and 15 White patients (Supplementary Fig. 4A)25. Compared with the present study, the Wang et al. study used a smaller cohort, purified total RNA from biopsy tissues (not prostatectomy tissues), used an exon array (not RNA seq), and did not conduct the analysis in the context of Gleason score. Despite these differences, to compare the results from the Wang et al. analysis with the present study, we first excluded the genes that were not annotated in either study. Second, we identified ARSs between Black and White GENCADE patients in tumors, regardless of the Gleason score. It is also important to note that, in the present study, we used |∆PSI| > 0.15 and p value < 0.05 to define ARSs, whereas in the Wang et al. study, only p value < 0.05 was applied. This comparative analysis identified 1,580 ARSs between Black and White GENCADE patients in tumors across all Gleason scores. Among these ARSs, 161 overlapping genes were found to undergo genetic ancestry-concordant ARSs in PCa across all Gleason scores in GENCADE and race-related alternative RNA splicing in the study by Wang et al. (Supplementary Fig. 4B, Supplementary Table 10).
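The effect of the two calling criteria described above can be illustrated with a minimal sketch. It assumes, for illustration only, that ∆PSI is the difference between the group mean PSI values and that the p value has already been computed by whichever test the pipeline uses; the function names and PSI values are hypothetical and are not taken from either study.

```python
from statistics import mean

DELTA_PSI_CUTOFF = 0.15  # threshold stated in the text for the present study
P_VALUE_CUTOFF = 0.05    # threshold stated in the text for both studies


def delta_psi(psi_group_a, psi_group_b):
    """Difference in mean PSI between two groups (an assumed definition)."""
    return mean(psi_group_a) - mean(psi_group_b)


def is_ars(psi_group_a, psi_group_b, p_value):
    """Apply both criteria: |dPSI| > 0.15 and p value < 0.05."""
    effect = abs(delta_psi(psi_group_a, psi_group_b))
    return effect > DELTA_PSI_CUTOFF and p_value < P_VALUE_CUTOFF


def is_ars_p_only(p_value):
    """Apply the p value criterion alone."""
    return p_value < P_VALUE_CUTOFF


# Hypothetical event with a small but statistically significant PSI shift.
group_a = [0.55, 0.60]
group_b = [0.50, 0.50]
print(is_ars(group_a, group_b, p_value=0.001))  # fails the |dPSI| criterion
print(is_ars_p_only(0.001))                     # passes the p-only criterion
```

An event like this one would be counted under a p-value-only rule but excluded once the effect-size threshold is added, which is one reason the event counts of the two studies are not directly comparable.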

Our previous study of race- and genetic ancestry-related alternative RNA splicing and transcription in PCa using TCGA used a larger cohort with more distributed Gleason scores; however, it was still limited with respect to the depth of the RNA seq that was conducted/available and the lack of paired TANs20. The number of patients with PCa in TCGA with available Gleason scores was 262 (Supplementary Fig. 4C). Samples with Gleason scores of 3 + 5, 4 + 3, 4 + 4, 4 + 5, 5 + 3, 5 + 4, and 5 + 5 were grouped as high-grade, whereas those with Gleason scores of 2 + 4, 3 + 3, and 3 + 4 were grouped as low-grade (Supplementary Fig. 4D). The number of TCGA PCa samples analyzed was as follows: 33 Black patients (16 high-grade and 17 low-grade) and 229 White patients (131 high-grade and 98 low-grade). To compare the ARSs in PCa between Black and White patients between GENCADE and our previous TCGA-based analysis, only the ARSs annotated in both studies were compared: for GENCADE, 198 were high-grade and 90 were low-grade and, for TCGA, 188 were high-grade and 203 were low-grade. Between GENCADE and our previous TCGA-based analysis, we found 16 overlapping ARSs between Black and White patients in high-grade PCa and 13 in low-grade PCa (Supplementary Fig. 4E, top and Supplementary Table 10). In addition, between GENCADE and our previous TCGA-based analysis, the number of overlapping genes undergoing ARSs in tumors between Black and White patients with high-grade or low-grade PCa was 17 and 15, respectively (Supplementary Fig. 4E, bottom and Supplementary Table 10).
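The Gleason grouping applied to the TCGA samples can be written as a small lookup. The sketch below encodes only the score combinations listed above; the function name `gleason_category` is an assumption for illustration, and combinations not listed in the text are rejected rather than guessed. The tally at the end rechecks the reported TCGA sample counts.

```python
# Primary + secondary Gleason patterns, as grouped in the text.
HIGH_GRADE = {(3, 5), (4, 3), (4, 4), (4, 5), (5, 3), (5, 4), (5, 5)}
LOW_GRADE = {(2, 4), (3, 3), (3, 4)}


def gleason_category(primary, secondary):
    """Map a Gleason score to the high-grade or low-grade group."""
    pattern = (primary, secondary)
    if pattern in HIGH_GRADE:
        return "high-grade"
    if pattern in LOW_GRADE:
        return "low-grade"
    raise ValueError(f"Gleason {primary} + {secondary} is not covered by the grouping")


# TCGA sample counts as reported in the text.
tcga_counts = {
    ("Black", "high-grade"): 16,
    ("Black", "low-grade"): 17,
    ("White", "high-grade"): 131,
    ("White", "low-grade"): 98,
}
black_total = sum(n for (race, _), n in tcga_counts.items() if race == "Black")
white_total = sum(n for (race, _), n in tcga_counts.items() if race == "White")

print(gleason_category(4, 3), gleason_category(3, 4))
print(black_total, white_total, black_total + white_total)
```

Note that order matters: 3 + 4 falls in the low-grade group while 4 + 3 falls in the high-grade group, so the score must be kept as a primary and secondary pair rather than as a sum.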

Majority of genes undergoing genetic ancestry-concordant ARSs do not also undergo differential aggregate expression or alternative polyadenylation

Next, we identified genetic ancestry-concordant DEs in PCa and TANs. We found 301 and 83 DEs in high-grade and low-grade tumors, respectively, and 268 and 174 DEs in the TANs associated with high-grade and low-grade tumors, respectively (Fig. 5E, Supplementary Table 6). The upregulated and downregulated DEs may reflect oncogenic and tumor suppressive mechanisms, respectively. Twenty-four DEs were shared between high-grade and low-grade tumors, and 53 DEs were shared between TANs associated with high-grade and low-grade tumors (Fig. 5F, top). Seventy DEs overlapped between high-grade tumors and TANs and 26 DEs overlapped between low-grade tumors and TANs (Fig. 5F, bottom). Thus, similar to ARSs, DEs largely do not overlap between tumors and TANs and between high-grade and low-grade tumors. Among the DEs in tumors between Black and White patients with high-grade PCa, we identified the following genes that have been reported to play a role in PCa: FGF10, FAS, SPHK2, CAMK2B, ERG, KLK11, NRG4, SEMA3E, ERAP2, and ABAT (Supplementary Table 9). Interestingly, we found that ERG was 2.5-fold (adj p value = 3.1 × 10⁻¹⁴) lower in tumors from Black patients compared to White patients with high-grade PCa (Supplementary Table 6), which was similarly reported by Farrell et al. in 201434. Consistent with the known increased levels of ERG in PCa35, we found that ERG was 3.5- and 8-fold higher in high-grade and low-grade tumor vs. TAN, respectively, among Black patients. Moreover, ERG was 8-fold higher in both high-grade and low-grade tumor vs. TAN among White patients. Additionally, we found that PTEN was 2-fold lower in tumor vs. TAN among White patients with high-grade or low-grade PCa and 1.7-fold lower in tumor vs. TAN among Black patients with high-grade or low-grade PCa. The decreased levels of PTEN in our analysis were consistent with the previously reported tendency of PTEN loss in PCa36.

A comparison of the genes undergoing ARSs and those undergoing DEs showed that, in contrast to genes undergoing ARSs and DEs between tumors and TANs, a much smaller proportion of genes undergoing ARSs in tumors and TANs also underwent DEs, with only eight of 301 genes undergoing ARSs and DEs in high-grade tumors, six of 83 genes undergoing ARSs and DEs in low-grade tumors, three of 269 genes undergoing ARSs and DEs in TANs associated with high-grade tumors, and six of 174 genes undergoing ARSs and DEs in TANs associated with low-grade tumors (Fig. 5G). Therefore, most genes undergoing ARSs in tumors or TANs do not undergo DEs.

To examine whether genes undergoing genetic ancestry-concordant ARSs and/or DEs in tumors and in TANs also undergo D-APAs, we identified genetic ancestry-concordant APAs in tumors and in TANs between Black and White patients with high-grade or low-grade PCa (Supplementary Fig. 6D, Supplementary Table 7). We found that only one gene undergoing genetic ancestry-concordant DE in high-grade tumors also underwent genetic ancestry-concordant APA and only one gene undergoing genetic ancestry-concordant ARS in high-grade tumors underwent genetic ancestry-concordant APA. In low-grade tumors, we did not find any genes undergoing genetic ancestry-concordant DEs or ARSs that also underwent genetic ancestry-concordant APAs. Similarly, the genes undergoing genetic ancestry-concordant DEs and/or ARSs in high-grade or low-grade TANs did not also undergo genetic ancestry-concordant APAs (Supplementary Fig. 6E).

Genes undergoing genetic ancestry-concordant ARSs in prostate tumors are enriched in cancer relevant pathways, including IL6 JAK STAT3 signaling, KRAS signaling, and P53

Gene set enrichment analysis (GSEA) of genes undergoing genetic ancestry-concordant ARSs in PCa also revealed enrichment in cancer-related pathways, such as IL6 JAK STAT3 Signaling and P53 (Fig. 6A). Likewise, genetic ancestry-concordant DEs in PCa were enriched in cancer-related pathways (Fig. 6B). Interestingly, genes undergoing ARSs were enriched in different cancer-related pathways from those enriched among genes undergoing DEs; therefore, analyzing ARSs in PCas and TANs revealed pathways important for PCa that would not have been found by DE analysis alone. The pathways to which genes undergoing ARSs belong may contribute to disparities in PCa.

Fig. 6: GSEA for genes undergoing genetic ancestry-concordant ARSs or DEs in tumors.

Point plots showing the hallmark pathways in the MSigDB in which genes undergoing genetic ancestry-concordant ARSs (A) or DEs (B) in the indicated comparisons are enriched. The x-axis shows the normalized enrichment score (NES) and the y-axis shows the pathway name. The size of the point corresponds to the number of genes undergoing genetic ancestry-concordant ARSs (A) or DEs (B). C Box plots showing PSI of selected genetic ancestry-concordant ARSs in high-grade tumors (top) or low-grade tumors (bottom) within genes belonging to the indicated pathway. The x-axis shows the ARSs as gene names followed by ARS types followed by exon number(s) involved and the y-axis shows the PSI.

Genes undergoing ARSs in high-grade tumors and enriched in the KRAS Signaling Down pathway were MFSD6, FGFR3, DTNB, THNSL2, SLC25A23, and ZNF112, in the IL6 JAK STAT3 Signaling pathway were MYD88, and in the Estrogen Response Early pathway were RHOD and CD44 (Fig. 6C, top). The genes undergoing ARSs in low-grade tumors and belonging to the IL6 JAK STAT3 Signaling pathway were MYD88, in the Apical Junction pathway were RSU1 and YWHAH, in the Estrogen Response Early pathway was CD44, and in the P53 pathway was CYFIP2 (Fig. 6C, bottom).

Genetic ancestry-concordant ARSs in tumors and TANs were associated with biochemical recurrence among GENCADE patients

To determine the potential clinical relevance of genetic ancestry-concordant ARSs in tumors, we analyzed the association between these ARSs and the probability of BCR. In this analysis, we defined a significant association between PSI and BCR as a hazard ratio greater than 2 (HR > 2) and a p value < 0.05. Two examples of ARSs in high-grade tumors associated with BCR were the exon 4.1:4.2 | 2 ME event in MLF1 and the exon 1 AP event in CHL1 (Fig. 7A). An additional five ARSs in high-grade tumors were associated with BCR (Supplementary Fig. 5A). One ARS in low-grade tumors that was associated with BCR was the exons 3 and 4 ME event in ZBTB17 (Fig. 7B). Interestingly, a number of ARSs identified in TANs were associated with BCR. Among the ARSs in TANs associated with high-grade tumors that were associated with BCR were the exon 2 | 3 ME event in ABCD4 and the exon 3:4 ES event in CNOT8 (Fig. 7C). An additional six ARSs in high-grade TANs were associated with BCR (Supplementary Fig. 5B). Among the ARSs in TANs associated with low-grade tumors that were associated with BCR were the exon 15:16.1:16.2 ES event in NF2 and the exon 1 AP event in AKAP5 (Fig. 7D). An additional four ARSs in low-grade TANs were associated with BCR (Supplementary Fig. 5C). Notably, the vast majority of the aforementioned ARSs that were associated with BCR exhibited PSI changes among Black patients predicting poor prognosis. For example, high PSI for the exon 4.1:4.2 | 2 ME event in MLF1, which was differential in tumors between Black and White patients with high-grade PCa, was associated with an increased probability of BCR, and this event exhibited increased PSI in tumors among Black patients compared to White patients with high-grade PCa (Fig. 7A).

Fig. 7: BCR analysis for the association between selected genetic ancestry-concordant ARSs and the probability of BCR.

Kaplan-Meier plots showing the association between BCR and the genetic ancestry-concordant ARSs identified in high-grade tumors (A), low-grade tumors (B), TANs associated with high-grade tumors (C), and TANs associated with low-grade tumors (D). The y-axis shows the probability of remaining free of BCR and the x-axis represents time in months. Patients were grouped into three groups: 1) PSI larger than the 75th percentile, 2) PSI smaller than the 25th percentile, and 3) PSI between the 75th and 25th percentiles. The box plots under the Kaplan-Meier plots show the PSI of the indicated ARSs between Black patients and White patients. The x-axis shows the PSI level and the y-axis indicates the self-reported race of the patient. Each dot represents a patient.
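The three-way patient grouping described in this legend can be sketched as follows. This is a minimal illustration, not the study's code: the patient identifiers and PSI values are invented, the function name `psi_groups` is hypothetical, and the percentile interpolation rule (Python's "inclusive" method) is an assumption, since the text does not state which rule was used.

```python
from statistics import quantiles


def psi_groups(psi_by_patient):
    """Assign each patient to a high, low, or middle PSI group.

    high   -- PSI larger than the 75th percentile
    low    -- PSI smaller than the 25th percentile
    middle -- PSI between the 25th and 75th percentiles
    """
    values = list(psi_by_patient.values())
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    groups = {}
    for patient, psi in psi_by_patient.items():
        if psi > q75:
            groups[patient] = "high"
        elif psi < q25:
            groups[patient] = "low"
        else:
            groups[patient] = "middle"
    return groups


# Hypothetical PSI values for one splicing event across nine patients.
toy_psi = {f"p{i}": i / 10 for i in range(1, 10)}
for patient, group in psi_groups(toy_psi).items():
    print(patient, toy_psi[patient], group)
```

Each group would then be passed to a survival estimator separately to draw one curve per group.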

Discussion

Alternative RNA splicing has been shown to affect gene function through its production of functionally diverse variants of the same gene, including in disease37,38. In cancer biology, the resulting translated protein isoforms have been reported to contribute to tumor cell proliferation, apoptosis, metastasis, and drug resistance (Fig. 8A)22,39. The present prospective study analyzed ARSs between prostate tumors and TANs among Black and among White patients and between Black and White patients. These comparative analyses identified ARSs that have the potential to contribute to increased BCR and mortality among Black PCa patients. Among the genes undergoing genetic ancestry-concordant ARSs, we identified MLF1 and CHL1 (in high-grade PCa), ZBTB17 (in low-grade PCa), ABCD4 and CNOT8 (in high-grade TANs), and NF2 and AKAP5 (in low-grade TANs). The genetic ancestry-concordant ARSs of these genes were associated with BCR in PCa patients as single events. Importantly, each of these genes has been reported to function in mechanisms that have the potential to influence cancer biology (Fig. 8B–H)40,41,42,43,44,45,46,47. For example, the exon 4.1:4.2 | 2 ME event in MLF1, which has been previously reported to promote proliferation and decrease apoptosis in lung adenocarcinoma cells, has now been identified in the context of PCa and patient genetic ancestry48. With respect to the exon 1 AP event in CHL1 identified herein, CHL1 has previously been reported to be involved in the development of different cancers49. Likewise, with respect to the exon 15:16.1:16.2 ES event in NF2 identified herein, we recently reported that the same exon skipping event in NF2 was enriched and correlated with poor progression in patients with hepatocellular carcinoma46. As a result of the findings herein, this alternative RNA splicing event has now been identified in the context of PCa and patient genetic ancestry. To our knowledge, prior to the findings herein, ABCD4 and CNOT8 had not been reported to play a role in PCa or PCa progression. These genes have been reported to function in mechanisms that have the potential to influence cancer biology, with mutations in the ATPase domain of ABCD4 leading to errors in vitamin B12 metabolism and CNOT8 modulating cell proliferation50,51. Taken together, the identified ARSs in the above genes have the potential to modulate PCa cell biology, resulting in increased proliferation, metastasis, and drug resistance. Thus, future studies to elucidate the specific mechanism by which each of these genes and its corresponding ARSs influences PCa cell biology and BCR are warranted.

Fig. 8: Graphical illustration of potential models whereby genes undergoing genetic ancestry-concordant D-ARS associated with BCR could influence cancer biology.

A Schematic diagram illustrating the role of ARS in oncogenesis. Potential models of oncogenic function for events involving MLF1 (B), CHL1 (C), ZBTB17 (D), ABCD4 (E), CNOT8 (F), NF2 (G), and AKAP5 (H). Ub ubiquitin, P phosphorylation, β2-AR beta-2 adrenergic receptor.

Notably, in addition to the genes we identified to undergo genetic ancestry-concordant ARS associated with BCR, among the larger set of genes we identified to undergo genetic ancestry-concordant ARSs in tumors and TANs were a number of genes known to play roles in cellular processes relevant to oncogenesis. For example, FGFR2 and FGFR3 have been reported to drive cell migration and invasion52. Importantly, it has been shown that alternative RNA splicing of FGFR modulates its functions53. Moreover, the genetic ancestry-concordant ARS of FGFR3 involving the mutual exclusion of exons 8 and 9 identified herein is known to result in different FGFR3 isoforms binding to different FGF ligands and plays a role in the pathophysiology of malignancies54. In addition, MDM2, PP2A, SMARCA2, and SLFN12 have been reported to play roles in cell cycle regulation and proliferation55,56,57,58. Although these genes were previously reported to play roles in cellular processes relevant to oncogenesis, the findings herein identified unique genetic ancestry-concordant ARSs within these genes having biological and clinical relevance to cancer. In addition to identifying genes that undergo genetic ancestry-concordant ARSs that have already been implicated in cancer, we also identified a number of genes whose role in cancer and cancer disparities has, to our knowledge, not been reported, such as WIZ. Loss of WIZ leads to cleft palate through its effect on histone methylation59.

The genetic ancestry-concordant ARSs identified in high-grade tumors generally did not overlap with those identified in low-grade tumors, suggesting that these ARSs can be exclusive to high-grade or low-grade PCa. The identification of unique ancestry-concordant ARSs in high-grade and low-grade PCa likely reflects differential biology in these tumors and may have implications for differences in response to therapy and choice of therapy for these patients. With respect to distinct genetic ancestry-concordant ARSs in low-grade tumors, it is interesting to note that a previous analysis of mortality among Black and White PCa patients revealed a significantly higher risk of PCa death among Black patients with Gleason score 6 PCa compared with White patients, but not among Black patients with Gleason score 7–10 PCa compared with White patients60.

Comparing DEs and APAs between Black and White patients in high-grade or low-grade tumors or TANs identified fewer genes undergoing genetic ancestry-concordant DEs or APAs than genes undergoing genetic ancestry-concordant ARSs. It is important to note that this could reflect differences in the statistical analyses used to identify ARSs, DEs, and APAs. We found that most genes undergoing ARSs between tumors and TANs or between Black and White patients did not undergo DE or APA. This underscores that analysis of DE alone is insufficient to understand the cancer proteome, and that it is the splicing transcriptome that defines the functional proteome.

Interestingly, we identified ARSs between Black and White patients in TANs associated with high-grade and low-grade tumors, suggesting that ARSs in TANs may also contribute to PCa biology and PCa disparities. We hypothesize that genetic ancestry-concordant ARSs in TANs may contribute to PCa and PCa progression by influencing and interacting with the tumor microenvironment.

Despite differences in cohort sizes, proportions of Black and White patients, distributions of Gleason scores, specimen types, and methodologies between our analysis of GENCADE and our previous analysis using the PCa TCGA dataset, the GENCADE study validated a number of genetic ancestry-concordant ARSs. Among these events were ULK3 AD 14.2, TMC5 AP 2, NEK3 AA 12.1, and XRCC4 AA 8.1. Similarly, despite the aforementioned differences as well as differences in genomic profiling platform and analysis between our analysis herein of GENCADE and the previous analysis we were a part of reported by Wang et al.25, the GENCADE study also validated a number of the genes undergoing genetic ancestry-concordant alternative RNA splicing events. Among these genes were ABCD3, ARHGAP8, and CD44. Additionally, a recently published work reporting a comprehensive analysis of ARSs in PCa vs. TAN, using three different gene expression datasets, identified a prognostic signature of ARSs in six genes: CYP4F12, NFATC4, PIGO, CYP3A5, ALS2CL, and FXYD324. Herein, we found that the exact ARSs of CYP4F12, NFATC4, and FXYD3 were also identified between tumor vs. TAN among Black patients and/or among White patients and the exact ARSs of CYP3A5 were also identified as genetic ancestry-concordant ARSs in both TANs associated with low-grade and high-grade tumors.

Our study had several limitations. First, the size of this analysis, with 39 patients per racial group and smaller proportions by Gleason grade, required us to dichotomize our samples into high-grade and low-grade groups. With greater numbers we may have been able to identify further distinctions between more homogeneously defined grade groups (e.g., Gleason 5 + 5). Nevertheless, distinct patterns were observed. Second, our follow-up period allowed for an association with BCR as an outcome. Although BCR is clinically significant in terms of management, not all patients with BCR have lethal disease. Longer follow-up and larger sample sizes would allow for the analysis of more robust clinical outcomes, including time to distant metastasis and overall survival.

Going forward, diagnostic and prognostic panels for PCa could incorporate ARSs, including genetic ancestry-concordant ARSs. Specific ARSs have the potential to be targeted for therapeutic application using splice-switching oligonucleotides (SSOs)61 or small molecules that specifically target individual proteoforms produced by the translation of alternative RNA splice variants. Additionally, a deeper genetic ancestry analysis at the individual event level will allow a deeper understanding of genetic ancestry-concordant ARSs.

Black patients continue to suffer disproportionately from PCa with respect to risk, progression, and mortality compared to patients of other racial and ethnic groups1,62,63. In addition, several reports have shown that despite often presenting with worse disease, Black patients participating in clinical trials for PCa have demonstrated better responses to particular PCa therapeutic regimens than White patients6,7. Therefore, there is an urgent need to equally represent Black patients in translational and clinical PCa research studies and to further study ARSs and genetic ancestry-concordant ARSs in the context of therapeutic response as well as their interactions with additional drivers of health contributing to PCa disparities. Understanding the full spectrum of drivers of PCa disparities, including ancestry-related genomics and biology, has the potential to aid in the development of new biomarkers and therapeutic agents for precision oncology interventions, and to optimize the timing and selection of therapeutic regimens for patients to improve PCa outcomes and mitigate PCa disparities.

Methods

GENomics of CAncer DisparitiEs (GENCADE) cohort

Patients diagnosed with PCa who were scheduled to undergo radical prostatectomy or biopsy were enrolled in the GENCADE study, according to the Institutional Review Board (IRB)-approved protocol Pro00053535 (NCT02229565). With respect to NCT02229565, the descriptors “African American” and “Caucasian American” refer to the same patients described as “Black” and “White”, respectively, herein. Patients were also co-enrolled in the Duke BioRepository & Precision Pathology Center (BRPC) protocol (Pro00035974) to facilitate tissue collection. Patients were recruited from 2014 to 2019. The inclusion criteria were 1) suspected malignancy of PCa, 2) self-reported race of Black or White, 3) planned availability of additional tissue, including snap frozen tissue, collected through inclusion in the IRB-approved biorepository protocol Pro00035974, 4) age ≥ 18 years, and 5) ability to read, understand, and sign an informed consent. The exclusion criteria were 1) collected tumor tissue that was inadequate for DNA and RNA analysis and/or was not positive for adenocarcinoma of the prostate and 2) prior systemic therapy, i.e., radiation, chemotherapy, or immunotherapy. Age, body mass index (BMI), comorbidities, treatments, and serum PSA levels were recorded at the time of enrollment. The recorded comorbidities were hypertension and diabetes mellitus type 1 and 2. Patients were followed until August 2023 to record PSA levels following radical prostatectomy and to document the assessment of imaging tests to detect any potential metastases.

GENCADE specimens by type, Gleason score, and patient self-reported race

Snap frozen tissue specimens were collected from 66 PCa patients undergoing radical prostatectomy and six patients undergoing PCa biopsy. For each tumor specimen, areas of 60% or greater prostate tumor content were identified and graded by a genitourinary pathologist using Gleason scoring. The TAN specimens, which were 100% tumor-free and contained 60% or greater benign prostate epithelial vs. stromal content, were also identified by a genitourinary pathologist. The PCa specimens comprised 35 tumors and 36 (35 + 1) TANs (including 33 paired tumor-TAN) from Black patients and 36 tumors and 34 TANs (including 31 paired tumor-TAN) from White patients. According to the Gleason score, the specimens were categorized as low-grade (3 + 3 and 3 + 4) or high-grade (4 + 3, 4 + 4, 4 + 5, and 5 + 5). The number of specimens in the high-grade group was 21, including nine from Black patients and 11 from White patients, and the number in the low-grade group was 51, including 26 from Black patients and 25 from White patients.

Genomic DNA purification and estimation of patient genetic ancestry

Whole blood specimens from 78 patients enrolled in GENCADE were collected at baseline, and genomic DNA was purified using the PreAnalytix PAXgene Blood DNA Kit (REF#: 762164). Patient genetic ancestry was estimated using the Illumina Infinium Multi-Ethnic Global-8 BeadChip. The IDAT files were converted to PED files using the Illumina Array Analysis Platform (IAAP) Genotyping CLI v1.1 (May 3, 2019) software, the Multi-EthnicGlobal D1 manifest, and the Multi-EthnicGlobal D1 ClusterFile. PED files were merged, output in BED format, and preprocessed using PLINK 1.964. Quality control and filtering were done using PLINK with the following arguments: “--geno 0.1”, “--mind 0.1”, “--maf 0.05”. This filtering kept 699,934 of 1,748,250 single nucleotide polymorphisms (SNPs). In addition, two samples belonging to White patients and one sample belonging to a Black patient were dropped due to inadequate data quality. Genetic ancestry was estimated from the SNP data using ADMIXTURE65 with K = 2. Haplotype samples from the Centre d’Etude du Polymorphisme Humain (CEPH) CEU (Utah Residents with Northern and Western European Ancestry) and YRI (Yoruba in Ibadan, Nigeria) populations were obtained from the Coriell Institute for Medical Research and included in our analysis to represent the indicated reference populations.
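
The three filters named above can be illustrated with a minimal Python sketch. This is not the PLINK implementation; it only mirrors the usual meaning of the thresholds (per-SNP missingness, per-sample missingness, and minor allele frequency). The function names, the genotype encoding (minor-allele count 0, 1, or 2, with None for a missing call), and the choice to filter samples before SNPs are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # assumed encoding: minor-allele count 0/1/2, None = missing call


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of missing calls in one row (sample) or column (SNP)."""
    if not calls:
        return 1.0
    return sum(c is None for c in calls) / len(calls)


def minor_allele_freq(calls: List[Genotype]) -> float:
    """Minor allele frequency from the non-missing calls of one SNP."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def qc_filter(
    matrix: List[List[Genotype]],
    max_snp_missing: float = 0.1,     # analogous to a per-SNP missingness cutoff
    max_sample_missing: float = 0.1,  # analogous to a per-sample missingness cutoff
    min_maf: float = 0.05,            # analogous to a minor allele frequency cutoff
) -> Tuple[List[int], List[int]]:
    """matrix[i][j] is the genotype of sample i at SNP j.

    Returns the indices of the samples and SNPs that pass all filters.
    Samples are filtered first (an assumption of this sketch).
    """
    n_snps = len(matrix[0])
    keep_samples = [
        i for i, row in enumerate(matrix) if missing_rate(row) <= max_sample_missing
    ]
    keep_snps = []
    for j in range(n_snps):
        column = [matrix[i][j] for i in keep_samples]
        if missing_rate(column) <= max_snp_missing and minor_allele_freq(column) >= min_maf:
            keep_snps.append(j)
    return keep_samples, keep_snps
```

In this sketch a SNP is dropped when it is monomorphic among the retained samples or when too many calls are missing, which is the general behavior the thresholds describe.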

Total RNA purification

Snap frozen prostate tissues were evaluated by a genitourinary pathologist to locate areas of 60% or greater tumor content and paired areas of 60% or greater benign epithelial tissue. Punch biopsies were taken from these tumor and TAN areas and used to purify total RNA using the Qiagen Allprep DNA/RNA Mini Kit (REF#: 80234). The yield, purity, and integrity of the total RNA were determined using the NanoDrop 260:280 ratio and DV200.

RNA sequencing

cDNA libraries were built using 200 ng of the total RNA and the Illumina Stranded Total RNA Prep with Ribo-Zero Plus (Illumina cat. 20040529). Libraries were pooled to equimolar concentration and sequenced on the NovaSeq 6000 S4 flow cell to produce 150 bp paired end reads targeting 100 million reads per sample.

RNA splicing and aggregate gene expression analysis

The quality of the raw read data generated by RNA seq was analyzed using FastQC 0.11.766. Low-quality reads, reads containing adapters, and reads that were too long or too short were trimmed and cleaned using Trim Galore 0.4.367. For RNA splicing analysis, the trimmed reads from RNA seq were analyzed using SpliceSeq 2.168. SpliceSeq employs Bowtie for read alignment. Each sample was analyzed in SpliceSeq using the default settings. The Percent Spliced In (PSI) values for the annotated ARSs in every sample were calculated and reported by SpliceSeq. To compare the level of a given ARS between tumors and TANs of high-grade or low-grade PCa among Black patients and White patients, we used the Wilcoxon signed-rank test. For aggregate gene expression analysis, the trimmed reads were aligned in two steps using STAR 2.7.5a and the aligned reads were counted using HTSeq 2.0.269,70. Read counts were normalized and comparative differential gene expression analysis was conducted using edgeR 3.38.471. Adjusted p values were calculated using the Benjamini-Hochberg method72.
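
For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following minimal Python sketch shows the standard step-up calculation. It is an illustration only and not the code used in the study, where adjusted p values came from the analysis packages; the function name is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the original input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values never increase as the raw p values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each raw p value is scaled by the number of tests divided by its rank, and the running minimum enforces monotonicity, so an adjusted value is never smaller than the raw value and never exceeds 1.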

Comparative ARS and DE analysis

To identify ARSs, we compared the level of ARSs between tumors and TANs among Black patients or among White patients with high-grade or low-grade PCa and between Black and White patients with high-grade or low-grade tumors or TANs. The SpliceSeq pipeline was used to generate splice graphs and unambiguously align RNA seq reads to these graphs, with SpliceSeq capable of accurately identifying and quantifying large and complex transcript variants68. This analysis annotated 81,084 ARSs. Only ARSs quantified in at least half of the samples and occurring in genes with aggregate expression greater than 2 counts per million (CPM) were included in downstream analyses. Seven types of ARSs were analyzed: exon skipping (ES), alternative acceptor (AA), alternative donor (AD), alternative promoter (AP), alternative terminator (AT), mutually exclusive exons (ME), and retained intron (RI). Importantly, ARSs include AP and AT events, which are driven not by splicing mechanisms but by transcription initiation and termination mechanisms. The ES type, across all comparisons, composed the largest number of ARSs, followed by the AP and AT types (Supplementary Fig. 2A). The average proportions of ARS types in all analyzed samples were as follows: ES: 50.4%, AP: 17.5%, AT: 11.5%, RI: 4.9%, AA: 7.4%, AD: 6.5% and ME: 1.8%. In addition to quantifying ARSs, we also quantified aggregate gene expression, and only genes with aggregate expression greater than 2 CPM were included in our comparative analyses.
We defined differential ARSs between two groups of samples as exhibiting a difference in PSI greater than 0.15 (∆PSI > 0.15) and a p value less than 0.05 (Supplementary Fig. 2B), and we defined DEs between two groups of samples as exhibiting fold change (FC) differences greater than 1.5-fold (∆FC > 1.5) and adjusted p values less than 0.05 (adj p value < 0.05).
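
The thresholds above can be expressed as a short Python sketch. It is illustrative only: the function names are hypothetical, the p values are assumed to have been computed beforehand, and several details that the text does not specify are assumptions, namely that ∆PSI is the absolute difference between group mean PSI values, that unquantified events are represented as None, and that the fold change cutoff applies in both directions.

```python
from statistics import mean
from typing import List, Optional


def is_differential_ars(
    psi_a: List[Optional[float]],
    psi_b: List[Optional[float]],
    p_value: float,
    min_delta_psi: float = 0.15,
    alpha: float = 0.05,
    min_fraction: float = 0.5,
) -> bool:
    """Apply the coverage, delta-PSI, and p value criteria to one event."""
    all_values = psi_a + psi_b
    quantified = [v for v in all_values if v is not None]
    # Require the event to be quantified in at least half of the samples.
    if len(quantified) < min_fraction * len(all_values):
        return False
    a = [v for v in psi_a if v is not None]
    b = [v for v in psi_b if v is not None]
    if not a or not b:
        return False
    # Assumption: delta PSI is the absolute difference of group means.
    delta_psi = abs(mean(a) - mean(b))
    return delta_psi > min_delta_psi and p_value < alpha


def is_differential_expression(
    fold_change: float,
    adjusted_p: float,
    min_fc: float = 1.5,
    alpha: float = 0.05,
) -> bool:
    """Apply the fold change and adjusted p value criteria to one gene."""
    # Assumption: a change of more than 1.5-fold in either direction counts.
    changed = fold_change > min_fc or fold_change < 1 / min_fc
    return changed and adjusted_p < alpha
```

For example, an event with mean PSI of 0.85 in one group and 0.55 in the other and p = 0.01 passes, while the same event with p = 0.2 does not.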

Alternative polyadenylation analysis

To quantify the differential alternative polyadenylation events in GENCADE samples, we utilized the REPAC R package with the RNA seq reads73. The trimmed reads were aligned using STAR 2.7.5a, and the resulting BAM files, together with the human hg38 PAS sites, were processed by the rse_from_bam function with the default arguments. The annotated and quantified alternative polyadenylation event dataset was further analyzed by the fit_repac function to identify the differential events between high-grade and low-grade tumors and TANs and between Black and White patients.

Gene set enrichment analysis (GSEA)

Genes undergoing ARS were ranked according to absolute ∆PSI and analyzed using fgsea74. The default options of the fgsea function were applied, except for the nperm option, which was set to 10000; the R random seed was set to 1991. We defined significant enrichment as a normalized enrichment score (NES) > 1.5 and a p value < 0.05.

Probability of BCR analysis by self-reported race and Gleason grade

For the BCR analysis, the survival R package 3.5.5 was used75. Kaplan–Meier estimates by self-reported race and Gleason grade were plotted using the survminer R package 0.4.976. The p value was calculated using the log-rank test. A Cox proportional hazards model was used to test the association between self-reported race and BCR, adjusting for surgical pathology Gleason score, age, BMI, and PSA.
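
To make the Kaplan–Meier estimate concrete, the following minimal Python sketch computes the product-limit estimate of BCR-free probability from follow-up times and event indicators. It illustrates the estimator only and is not the survival R package code used in the study; the function name and input layout are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, event-free probability) at each time with at least one event.

    times  : follow-up time in months for each patient
    events : True if BCR was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

Censored patients contribute to the number at risk until their last follow-up and then leave the risk set without lowering the curve.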

BCR association with the level of genetic ancestry-concordant ARSs

BCR data for the 71 GENCADE patients whose tumor and TAN RNA were analyzed for ARS and DE were used, and the BCR analysis was conducted using a Cox regression model with patients grouped according to PSI quartiles of the indicated ARS: 1st, 4th, and 2nd + 3rd combined. HRs were estimated for the upper (1st), lower (4th), and middle (2nd + 3rd) quartile groups, and p values were calculated using the log-rank test. Kaplan-Meier estimates by PSI of a given ARS were plotted using the survminer R package 0.4.976.
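
The grouping of patients by PSI quartile can be sketched in Python as follows. This is an illustration under stated assumptions, not the study code: the function name and group labels are hypothetical, and the quartile cut points use the "inclusive" interpolation method of the standard library, which may differ from the method used in R.

```python
from statistics import quantiles
from typing import List


def psi_quartile_groups(psi: List[float]) -> List[str]:
    """Label each patient by the PSI of one ARS relative to the quartile cut points.

    'high'   : PSI above the 75th percentile
    'low'    : PSI below the 25th percentile
    'middle' : PSI between the 25th and 75th percentiles (inclusive)
    """
    q25, _, q75 = quantiles(psi, n=4, method="inclusive")
    labels = []
    for value in psi:
        if value > q75:
            labels.append("high")
        elif value < q25:
            labels.append("low")
        else:
            labels.append("middle")
    return labels
```

The resulting labels can then serve as the grouping variable for Kaplan-Meier curves or as a categorical covariate in a Cox model.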