Background
The Prostatype® Test evaluates expression levels of three stem cell genes (IGFBP3, F3, and VGLL3), which are combined with PSA, stage, and grade to calculate P-score. Previous research found P-score accurately predicts prostate cancer (PC) specific mortality (PCSM) in patients with newly diagnosed clinically localized PC. We evaluated the performance of P-score to predict PCSM in a large, multiethnic cohort from the Veterans’ Administration (VA).
Methods
After pathologic review to ensure sufficient tumor tissue, formalin-fixed paraffin-embedded (FFPE) biopsy cores from patients with newly diagnosed PC at the Durham VA were sent to an academic medical center. There, cores were sectioned, RNA extracted, and reverse transcription quantitative polymerase chain reaction (RT-qPCR) tests conducted for IGFBP3, F3, VGLL3, and GAPDH (control). Results were combined with clinical data to generate P-scores. The association between P-score and PCSM was evaluated using c-index, Cox and Fine-Gray models, and decision curve analysis (DCA).
Results
Higher P-scores were significantly associated with a higher risk of PCSM (HR = 1.48 per 1 unit increase in P-score, 95% CI: 1.20–1.84, p <0.001) and accurately estimated PCSM (c-index = 0.87). Adding clinical variables to P-score only incrementally improved accuracy. The DCA indicated P-score provided net clinical benefit for patients with PCSM risk between 5% and ~50%. As P-score strongly correlated with risk group, we tested the value of P-score in intermediate-risk patients specifically, where it significantly predicted PCSM (HR 1.43, 95% CI: 1.09–1.86, p = 0.009).
Conclusion
In this American cohort of veterans, P-score significantly predicted PCSM. Adding clinical variables minimally improved accuracy. Accuracy remained high in intermediate-risk patients, wherein there is arguably the greatest need for better risk stratification. Given P-scores can be generated rapidly in-house using a standardized RT-qPCR assay, P-score represents a robust new tool to risk-stratify newly diagnosed patients for PC death, thereby minimizing mismatched treatments.
Introduction
Despite advances in PC diagnosis and risk stratification, current methods of predicting PC aggressiveness remain suboptimal. This is mainly because methods rely on qualitative clinical assessments and human interpretation rather than validated quantitative tests that consider individual patients’ PC pathology. Accordingly, patients with PC often receive mismatched treatments, leading to undertreatment of aggressive disease or overtreatment of indolent disease. Several molecular tests were developed to address this. All existing tests require centralized reference laboratories and cannot be performed locally.
To overcome limitations of other tests, Prostatype® Test, a standardized RT-qPCR assay, was developed to evaluate the expression of stem cell genes IGFBP3, F3, and VGLL3 (the three-gene signature). Prostatype® Test can be performed locally either in hospital molecular laboratories (Europe) or designated CLIA-certified facilities (U.S.) (i.e., “in-house”), allowing faster and more flexible implementation.
The three-gene signature significantly predicted overall and PC-specific mortality (PCSM) in a Swedish cohort of 189 PC patients diagnosed between 1986 and 2001 [1]. Another study evaluated whether adding the expression levels of IGFBP3 and F3 from formalin-fixed paraffin-embedded (FFPE) prostate biopsies could improve the prediction of overall survival compared to clinical parameters alone in 241 PC patients. The results showed that combining the three-gene signature with PSA, Gleason score, and tumor stage at diagnosis significantly improved survival prediction accuracy [2]. A follow-up study developed the integrated Prostatype score (P-score) by combining the three-gene signature with serum PSA, Gleason score, and clinical T-stage. The study demonstrated that continuous P-score (0–15) could be categorized into low- (0–2), intermediate- (3–5), and high-risk (6–15) groups, which significantly predicted PCSM [3]. The categorized P-score has since been locked and validated in independent cohorts from Taiwan, Spain, and Sweden [4,5,6]. Whether P-score predicts outcomes in more diverse populations outside of Europe and Asia remains untested.
To assess the performance of P-scores in an American cohort, we analyzed a multiethnic cohort of PC patients from a Veterans Administration (VA) hospital and conducted assays in-house at an academic center, correlating resultant P-scores with PCSM among patients newly diagnosed with clinically localized PC. We hypothesized P-scores would significantly and accurately predict PCSM, providing unique information above and beyond standard clinical variables.
Methods
Study design and participants
After obtaining approval with waivers of written consent from the Durham VA IRB, we identified patients diagnosed January 1, 2002, to December 31, 2019, at the Durham VA with very low- to high-risk PC [7] with diagnostic biopsy tissue available. Patients were excluded if they had a history of cancers other than PC (excluding basal cell or squamous cell skin cancers) prior to PC diagnosis, were diagnosed with very high-risk PC, or had clinical evidence of metastasis at diagnosis. After reviewing charts and confirming tissue availability, 1531 patients were eligible (Supplementary Fig. 1).
Patients’ biopsy tissue blocks and their corresponding hematoxylin and eosin slides were reviewed by an expert pathologist, and 729 patients were excluded because of absent or limited tissue. FFPE tissue blocks from the remaining 802 unique patients were sent to Cedars-Sinai Medical Center for further evaluation. Four to ten sections (depending on tissue availability), each ≥8 µm thick, were cut from the FFPE blocks and reviewed by a molecular pathologist to select samples with ≥50% tumor content. This step resulted in the exclusion of 452 patients. We attribute this sample insufficiency to the greater tissue requirements of pre-2012 PC diagnostic methods and to the age and instability of FFPE material from that period.
From the remaining 349 patients, RNA was extracted and assayed by RT-qPCR for the three-gene signature and GAPDH. P-scores were calculated using a locked and validated algorithm that integrates gene expression of VGLL3, IGFBP3, and F3 (measured as ΔCT values normalized to GAPDH) with clinical variables (PSA, Gleason score, and clinical T-stage). The resulting score (the exact algorithm is proprietary) ranges from 0 to 15, with higher values indicating increased risk [3]. Of the 349 patients tested, more samples from 2013 to 2019 met the GAPDH threshold for Prostatype testing than samples from 2002 to 2012 (Supplementary Table 1), and samples from 2018 to 2019 outperformed all others. Given that a prior study found that the choice of biopsy core did not impact the prognostic performance of P-scores [8], and a more recent validation confirmed consistent test results across tumor foci [9], we included two tissue samples for some patients (N = 6). P-scores for these 6 patients matched exactly (N = 2), differed by 1 point (N = 2), or differed by 2 points (N = 2). Importantly, none of these differences moved a patient into a different P-score risk group. When more than one P-score was available, we used the higher score. In total, samples from 160 patients met the GAPDH threshold required for P-score calculation (CT values < 28) and were included in the final study population.
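The normalization and quality-control steps described above can be sketched in a few lines. This is an illustration only, not the proprietary P-score algorithm: it assumes the common convention ΔCT = CT(target) − CT(GAPDH), reads the stated threshold as a cutoff on the GAPDH CT value, and uses hypothetical function names.

```python
from typing import Dict, Iterable, Optional

# Assumption: the "CT values < 28" threshold applies to the GAPDH control.
GAPDH_CT_CUTOFF = 28.0
TARGET_GENES = ("IGFBP3", "F3", "VGLL3")


def delta_ct(ct_values: Dict[str, float]) -> Optional[Dict[str, float]]:
    """Return the delta-CT of each target gene, or None if GAPDH QC fails.

    Assumes delta-CT = CT(target) - CT(GAPDH), the usual RT-qPCR convention.
    """
    gapdh = ct_values["GAPDH"]
    if gapdh >= GAPDH_CT_CUTOFF:
        return None  # control gene amplified too late; sample is excluded
    return {gene: ct_values[gene] - gapdh for gene in TARGET_GENES}


def patient_p_score(sample_scores: Iterable[int]) -> int:
    """When several samples from one patient were scored, keep the highest."""
    return max(sample_scores)
```

Taking the maximum across samples is the conservative choice described in the text: a patient is never assigned a lower risk than the most aggressive sampled focus suggests.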
Statistical analysis
Descriptive statistics were generated for patient characteristics (Table 1): medians (IQR) for continuous variables (age, BMI, prostate and PSA characteristics, and follow-up time), and frequencies and percentages for categorical variables (biopsy characteristics, grade group, primary therapy type, NCCN risk group, and cause of death). Characteristics were stratified by P-score, divided into a priori defined low- (0–2), intermediate- (3–5), and high-risk (6–15) groups.
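The a priori grouping can be written as a small lookup. The sketch below assumes integer P-scores, and the function name is hypothetical; only the cut points (0–2, 3–5, 6–15) come from the text.

```python
def p_score_risk_group(p_score: int) -> str:
    """Map a P-score (0-15) to its pre-defined risk group."""
    if not 0 <= p_score <= 15:
        raise ValueError("P-score must lie between 0 and 15")
    if p_score <= 2:
        return "low"
    if p_score <= 5:
        return "intermediate"
    return "high"
```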
We evaluated differences in characteristics among risk groups using Kruskal–Wallis tests for continuous variables. Fisher’s exact test assessed any association between race, cancer characteristics, and cause of death among risk groups.
As competing risks were present in our analysis of PCSM (i.e., death from non-PC causes), we modeled both cause-specific hazard and sub-distribution hazard under the Fine-Gray method. As these two models can yield different results, prior work recommended presenting both [10].
Cumulative incidence functions were estimated and stratified by pre-defined P-score groups: low (0–2), intermediate (3–5), and high (6–15). Univariable cause-specific Cox and Fine-Gray models were fitted using P-score as a continuous variable to predict PCSM. Because there were few PC-specific deaths, we could not fit a full multivariable model without overfitting. Instead, we tested whether the association between P-score and PCSM was independent of PSA, grade, and NCCN risk group by adjusting for each in a separate model.
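To make the competing-risks idea concrete, the following is a minimal sketch of a nonparametric cumulative incidence estimator (the Aalen-Johansen form), written in plain Python. It is not the software used in this study; the event coding (0 = censored, 1 = PC death, 2 = death from other causes) and the function name are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def cumulative_incidence(
    times: Sequence[float], events: Sequence[int], cause: int = 1
) -> List[Tuple[float, float]]:
    """Estimate the cumulative incidence of `cause` with competing events.

    events: 0 = censored, any other integer = a cause of death.
    Returns (time, cumulative incidence) at each time where a death occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    overall_survival = 1.0  # all-cause survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        if d_any > 0:
            # Increment = P(alive until t) * cause-specific hazard at t
            cif += overall_survival * d_cause / n_at_risk
            overall_survival *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve
```

Unlike one minus a Kaplan-Meier estimate that censors non-PC deaths, this estimator does not treat patients who died of other causes as still at risk of PC death, which is why it is preferred when competing risks are present.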
We assessed the accuracy of P-scores for predicting death using the concordance index (c-index) and the area under the curve (AUC) for death at 10 years. The c-index over the follow-up period was based on the cause-specific Cox model assessing PCSM. The model generated a risk score for each unique patient, and the c-index was calculated by comparing all possible pairs of patients to determine how well the model ranked those at higher risk of PCSM. To assess the clinical benefit of P-scores in predicting PCSM at 10 years, we used a decision curve analysis (DCA).
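The pairwise comparison behind the c-index can be illustrated with a short sketch of Harrell's estimator. This is an assumption-laden illustration rather than the study's code: events are coded 1 for PC death and 0 otherwise, so deaths from other causes are treated as censored (the cause-specific view), and the function name is hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float], events: Sequence[int], risk_scores: Sequence[float]
) -> float:
    """Fraction of comparable pairs in which the higher-risk patient died first.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under follow-up at that time. Ties in risk score
    count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # pairs are anchored on an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by time to event.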
Results
Baseline characteristics
The cohort included 160 patients (demographic and clinical characteristics by P-score are shown in Table 1), with a median age of 64.5 years and a median PSA at diagnosis of 7.7 ng/mL. Most patients were Black (73%); the rest were White (27%) or American Indian/Native Alaskan (<1%). The most common primary treatments were radiation therapy (43%) and radical prostatectomy (RP) (38%).
P-scores ranged from 0 to 15, with a median of 5. When stratified by pre-defined thresholds, patients with high P-scores tended to have signs of more aggressive disease including more positive cores (p <0.001), higher clinical stage (p <0.004), higher grade group (p <0.001), higher PSA (p <0.001), higher NCCN risk group (p <0.001), and were more likely to have died from PC (p <0.001). Reflecting their higher-risk profile, patients with elevated P-scores were more frequently treated with hormonal therapy alone and were less likely to undergo RP (p <0.001). Notably, 6 patients (9.5%) with high-risk P-scores were managed with either no treatment or active surveillance. Several of these patients experienced poor outcomes, consistent with undertreatment relative to their high molecular risk profile.
P-score and PC death
During a median follow-up of 7.5 years, 51 patients died—14 from PC and 37 from causes other than PC.
When modeled as a continuous variable, higher P-scores were significantly associated with increased risk of PCSM (HR = 1.48 per unit increase, 95% CI: 1.20–1.84; p <0.001; Table 2). P-score significantly separated groups based on PCSM risk (Fig. 1).
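Because the Cox model is log-linear in a continuous covariate, a hazard ratio reported per 1-unit increase scales multiplicatively over larger differences. The sketch below is only an arithmetic illustration of that model assumption, not an additional result of this study; the function name is hypothetical.

```python
import math


def hazard_ratio_for_difference(hr_per_unit: float, units: float) -> float:
    """Hazard ratio implied by a `units`-point difference in the covariate.

    Under the Cox model, log(HR) is linear in the covariate, so the ratio
    for a k-unit difference is hr_per_unit ** k.
    """
    return math.exp(units * math.log(hr_per_unit))
```

For example, with HR = 1.48 per unit, a 3-point difference in P-score would correspond to a hazard ratio of about 3.2, provided the log-linearity assumption holds across the score range.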
Because there were few PC deaths, we could not fit a full multivariable model. Thus, to test whether P-score provided independent information, we adjusted for PSA, grade, and NCCN risk group in separate models (Table 2). In subsequent Fine-Gray competing risk models, P-score remained a significant predictor of PCSM regardless of which clinical variable was adjusted for, with HRs per 1-unit increase in P-score ranging from 1.19 to 1.38 (all p ≤ 0.024).
Accuracy for predicting PC death
Given that P-score significantly predicted PCSM after adjusting for key clinical variables, we next asked how accurately it could assess PCSM risk, using both the c-index (0.87) and the AUC for predicting PCSM within 10 years after diagnosis (0.80) (Table 3).
P-score demonstrated high accuracy for predicting PCSM as well as 10-year PCSM, outperforming PSA and grade individually. Although the NCCN risk group, which incorporates PSA, grade, and clinical stage, showed the highest accuracy among models using clinical variables, it remained inferior to P-score. Adding clinical variables to P-score resulted in only marginal improvements in predictive performance.
Decision curve analysis
To evaluate the clinical benefit of P-score, we performed a DCA (Fig. 2). The analysis showed that using P-score to guide clinical decision-making yielded positive net benefit across a wide range of threshold probabilities for PCSM, ranging from 5 to ~50%.
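For readers unfamiliar with DCA, net benefit at a given threshold probability weighs true positives against false positives, with the weight set by the threshold itself. The sketch below uses the standard binary-outcome formula and hypothetical function names; it ignores censoring and competing risks, which a 10-year analysis such as the one reported here would need to handle.

```python
from typing import Sequence


def net_benefit(
    predicted_risks: Sequence[float], outcomes: Sequence[int], threshold: float
) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold.

    outcomes: 1 = event occurred, 0 = no event (binary, no censoring).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    true_pos = sum(1 for r, y in pairs if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in pairs if r >= threshold and y == 0)
    # A false positive is penalized by the odds of the threshold probability
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model shows clinical benefit at a threshold when its net benefit exceeds both the treat-all value and zero, the net benefit of treating no one.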
Sensitivity analysis
Given the lack of PC deaths in the low-risk group and that P-score strongly correlated with the risk group, we tested the value of P-score in intermediate-risk patients, the group that would arguably benefit most from further risk stratification. Even in this more restrictive population, P-score remained a significant predictor of PCSM on Fine-Gray analysis with very similar hazard ratios to the full cohort (HR 1.43, 95% CI: 1.09–1.86, p = 0.009).
Discussion
Risk stratification of clinically localized PC remains suboptimal. Quick, easily obtained biomarkers are needed to improve predictions of PC aggressiveness. We evaluated P-score to understand its capability in predicting PCSM in our multiethnic cohort of PC patients. We found P-score was a significant predictor of PCSM with high accuracy. Adjustment for or addition of standard clinical variables had a limited incremental impact on the predictive accuracy, highlighting P-score’s potential value as a robust prognostic tool. Importantly, it provided accurate risk assessment even in intermediate-risk patients. These data support that P-score is an accurate and valid predictor of PCSM among patients with clinically localized PC.
Historically, PC risk stratification was based on PSA, stage, and grade [11]. More recently, gene expression levels within the tumor have gained interest. There are 3 commercially available tests using tumor gene expression to aid risk stratification [12]. While all three tests perform well, they are sent out to labs, which increases wait times for results. As such, an unmet need in the field is a fast, flexible, in-house assay that accurately predicts PCSM.
In our cohort of newly diagnosed PC patients, we found P-score significantly predicted PCSM with high accuracy (c-index = 0.87). Prior studies have also shown high accuracy in newly diagnosed PC patients, including in a cohort of 316 in Sweden (AUC = 0.93) [6], a cohort of 93 in Spain (AUC = 0.81) [5], and a cohort of 92 in Taiwan (c-index = 0.90) [4]. Together with ours, these studies place the accuracy of P-score (c-index or AUC) between 0.80 and 0.93 across diverse healthcare settings and patient populations. Notably, this level of accuracy for predicting PCSM compares well to other commercially available tests for PC risk stratification, such as Decipher (c-index = 0.85) [13], Prolaris (c-index = 0.78 for 10-year PCSM) [14], and Oncotype (time-dependent AUC = 0.84) [15]. Our current results, along with those from prior studies, support that P-score, easily generated in-house, can accurately predict PCSM and may be a valuable tool for PC risk stratification.
When developing new biomarkers, it is crucial to ask whether they provide information above and beyond what can be obtained by standard clinical variables. It is notable that P-score outperformed PSA, grade, and risk group, which combines PSA, stage, and grade. P-score’s superior performance compared to these factors alone suggests that the three-gene signature adds unique prognostic value. This is reflected in Table 3, where P-score achieved higher accuracy than PSA or grade individually, and in the adjusted Fine-Gray models, where P-score remained a significant predictor of PCSM even after adjustment for each of these clinical variables. Moreover, the addition of clinical variables to P-score resulted in minimal improvement in accuracy. Within the limitations of the small sample size and few PC deaths, P-score provided unique information above and beyond standard clinical variables. Indeed, P-score provided net clinical benefit beyond treating all or no patients within a PCSM range of ~5–50%. This range reflects common clinical decision thresholds when considering initiation or escalation of treatment in patients with localized PC, suggesting P-score could potentially help avoid overtreatment in low-risk individuals and undertreatment in those at higher risk. Importantly, we found a subset of patients with high P-scores managed conservatively, with either no treatment or active surveillance. Several of these patients experienced adverse outcomes, suggesting that clinical parameters may have underestimated the biological aggressiveness of their disease. This highlights a potential clinical application of P-score: identifying patients at risk for undertreatment, thereby improving decision-making for patients with high-risk molecular profiles.
Notably, intermediate-risk PC presents a great challenge in accurate risk stratification. An important finding from our study was that P-score remained a significant predictor of PCSM among intermediate-risk patients, with accuracy on par with the full cohort. While further confirmation in larger cohorts is needed, these data support the use of P-score across the full spectrum of patients, most notably among intermediate-risk patients, where there is a great unmet need.
One key strength of P-scores is that they can be generated rapidly in-house in hospital molecular laboratories (Europe) or in designated CLIA-certified facilities (U.S.). As such, they are differentiated from other commercially available test results, which require the use of centralized reference labs. Intuitively, this should lead to lower costs. Indeed, a prior Swedish study found that use of P-scores was not only associated with improved quality-adjusted life years but also lowered costs [16]. Though this would need confirmation in other healthcare systems, including the US, it is noteworthy that a test might be cost-effective and lower healthcare costs.
An important strength of this study is that most PC patients were African American (73%), a group historically underrepresented in genomic validation studies. This complements prior studies in European and Asian populations and enhances the relevance of our findings for addressing PC disparities. We also tested whether P-score provided information above and beyond standard clinical variables and assessed its potential specifically among intermediate-risk patients. Finally, we measured PCSM rather than intermediate endpoints, which are not as well-linked with PCSM.
These strengths notwithstanding, our study had some limitations. The number of included patients was modest, and the number of PC deaths was low. This limited our power to test other PC endpoints and for a more robust multivariable adjustment. These low numbers reflect a high drop-out rate due to small amounts of tissue available in prostate biopsies. Prior to ~2016, practice patterns at the Durham VA were to include all cores from a single site (i.e., left vs. right) into one block and use a greater number of sections to make diagnoses, leaving less residual tissue for research. As such, it was sometimes challenging to create sections that contained sufficient tumor for analysis. Also, some patients in this study were included in prior research studies, and there was not sufficient tumor tissue remaining to generate P-scores. Likewise, obtaining RT-qPCR-quality RNA on 20+ year old samples was limiting. Newly acquired cores in separate jars and undegraded RNA (i.e., samples from 2018 to 2019) had higher yields for use in calculating P-scores (Supplementary Table 1). Finally, patients in this study received heterogeneous treatments, potentially impacting PCSM risk. However, we were not powered to test this or stratify patients by treatment received. As such, further testing in larger cohorts is warranted.
In a multiethnic cohort of PC patients from a VA hospital, Prostatype P-scores, derived using standardized in-house RT-qPCR assays, accurately predicted PCSM beyond standard clinical information and performed nearly equally among intermediate-risk patients as all patients. These findings support the integration of P-scores into clinical workflows for quick, accurate risk stratification of newly diagnosed PC patients, particularly among those with intermediate-risk disease.
References
1. Peng Z, Skoog L, Hellborg H, Jonstam G, Wingmo IL, Hjälm-Eriksson M, et al. An expression signature at diagnosis to estimate prostate cancer patients’ overall survival. Prostate Cancer Prostatic Dis. 2014;17:81–90.
2. Peng Z, Andersson K, Lindholm J, Dethlefsen O, Pramana S, Pawitan Y, et al. Improving the prediction of prostate cancer overall survival by supplementing readily available clinical data with gene expression levels of IGFBP3 and F3 in formalin-fixed paraffin embedded core needle biopsy material. PLoS ONE. 2016;11:e0145545.
3. Soderdahl F, Xu LD, Bring J, Haggman M. A novel risk score (P-score) based on a three-gene signature, for estimating the risk of prostate cancer-specific mortality. Res Rep Urol. 2022;14:203–217.
4. Pang ST, Lin PH, Berglund E, Xu L, Shao IH, Yu KJ, et al. First validation of the Prostatype® P-score in an Asian cohort: improving risk stratification for prostate cancer. BJUI Compass. 2025;6:e70026.
5. González-Peramato P, Álvarez-Maestro M, Heredia-Soto V, Mendiola Sabio M, Linares E, Serrano Á, et al. Comparing Prostatype P-score and traditional risk models for predicting prostate cancer outcomes in Spain. Actas Urol Esp (Engl Ed). 2025;49:501788.
6. Saemundsson A, Xu LD, Meisgen F, Cao R, Ahlgren G. Validation of the prognostic value of a three-gene signature and clinical parameters-based risk score in prostate cancer patients. Prostate. 2023;83:1133–40.
7. NCCN. NCCN clinical practice guidelines in oncology: prostate cancer. 2025 [cited 2025 October 1]; Version 2.2026. Available from: https://www.nccn.org/professionals/physician_gls/pdf/prostate.pdf
8. Peng Z, Andersson K, Lindholm J, Bodin I, Pramana S, Pawitan Y, et al. Operator dependent choice of prostate cancer biopsy has limited impact on a gene signature analysis for the highly expressed genes IGFBP3 and F3 in prostate cancer epithelial cells. PLoS ONE. 2014;9:e109610.
9. Röbeck P, Xu L, Ahmed D, Dragomir A, Dahlman P, Häggman M, et al. P-score in preoperative biopsies accurately predicts P-score in final pathology at radical prostatectomy in patients with localized prostate cancer. Prostate. 2023;83:831–9.
10. Austin PC, Lee DS, Fine JP. Introduction to the analysis of survival data in the presence of competing risks. Circulation. 2016;133:601–9.
11. Partin AW, Kattan MW, Subong EN, Walsh PC, Wojno KJ, Oesterling JE, et al. Combination of prostate-specific antigen, clinical stage, and Gleason score to predict pathological stage of localized prostate cancer. A multi-institutional update. JAMA. 1997;277:1445–51.
12. Cucchiara V, Cooperberg MR, Dall’Era M, Lin DW, Montorsi F, Schalken JA, et al. Genomic markers in prostate cancer decision making. Eur Urol. 2018;73:572–82.
13. Howard LE, Zhang J, Fishbane N, Hoedt AM, Klaassen Z, Spratt DE, et al. Validation of a genomic classifier for prediction of metastasis and prostate cancer-specific mortality in African-American men following radical prostatectomy in an equal access healthcare setting. Prostate Cancer Prostatic Dis. 2020;23:419–28.
14. Cuzick J, Swanson GP, Fisher G, Brothman AR, Berney DM, Reid JE, et al. Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study. Lancet Oncol. 2011;12:245–55.
15. Janes JL, Boyer MJ, Bennett JP, Thomas VM, De Hoedt AM, Edwards VD, et al. The 17-gene genomic prostate score test is prognostic for outcomes after primary external beam radiation therapy in men with clinically localized prostate cancer. Int J Radiat Oncol Biol Phys. 2023;115:120–31.
16. Fridhammar A, Frisell O, Wahlberg K, Berglund E, Röbeck P, Persson S. Prognostic testing for prostate cancer—a cost-effectiveness analysis comparing a prostatype P-Score Biomarker approach to standard clinical practice. Pharmacoeconomics. 2025;43:509–20.
Funding
Sponsored by Prostatype Genomics, Augustendalsvägen 20, 131 52 Nacka Strand, Sweden. Open access funding provided by SCELC, Statewide California Electronic Library Consortium.
Author information
Contributions
AM managed the project at Cedars-Sinai and led the manuscript writing process. TDT conducted statistical analyses on samples. EB and JA created the Prostatype testing protocols. CA and AES conducted a pathological review of the samples. KB, IER, and SS managed the tissue samples at Cedars-Sinai. HB and AJ managed the project at the VA. AH and MB supervised the use of samples at the VA. EV performed a pathological review of samples and oversaw testing of samples. AP, RK, and AA conducted and troubleshot the Prostatype testing at Cedars-Sinai. SJF was the PI, obtained funding for the study, and supervised the study and the manuscript writing process.
Ethics declarations
Competing interests
Two authors, Emelie Berglund, PhD and Gerald L. Andriole, MD, are from Prostatype Genomics, the entity developing the Prostatype Test evaluated in this study.
Ethics approval and consent to participate statement
The study was approved by the ethics committee of Cedars-Sinai (IRB #STUDY00002370) on 7/6/23. Written informed consent was obtained from all individual participants included in the study. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mack, A., Tran, T.D., Berglund, E. et al. Validation of the Prostatype® P-score for predicting prostate cancer specific mortality in a multiethnic U.S. veterans cohort. Prostate Cancer Prostatic Dis (2026). https://doi.org/10.1038/s41391-025-01070-8
DOI: https://doi.org/10.1038/s41391-025-01070-8




