Table 2. Model reduction and the partial likelihood ratio (LR) test showed that a baseline model with the novel risk groups is statistically comparable to the full model for predicting cancer-specific survival.

Model	Nested partial likelihood ratio test, LR (p-value)	AIC	BIC
1st external validation set (CPCBN)
Baseline model + risk group + GG	Reference (full model)	167.5161	169.6403
Baseline model + risk group	1.560 (0.213)	167.07	168.4861
Baseline model + GG	4.114 (0.036)	169.6098	171.0259
Baseline model (pT)	7.557 (0.027)	171.0623	171.7703
2nd external validation set (PROCURE)
Baseline model + risk group + GG	Reference (full model)	170.6459	174.824
Baseline model + risk group	4.094 (1.2e−1)	172.7398	175.8733
Baseline model + GG	10.056 (1.8e−3)	178.7017	181.8353
Baseline model (pT + pN)	27.777 (6.8e−5)	194.4228	196.5119
3rd external validation set (PLCO)^a
Baseline model + risk group + GS	Reference (full model)	298.0581	301.8323
Baseline model + risk group	14.849 (1e−4)	310.9075	313.4237
Baseline model + GS	5.15 (0.023)	301.2158	303.732
Baseline model (prostate pathologic stage)	26.769 (1.54e−06)	320.8274	322.0855

In contrast, a baseline model with GG (Gleason score/ISUP grade groups) was not comparable to the full model. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) support this finding, since a baseline model with the novel risk groups fits better than a baseline model with GG. pT: pathologic tumor stage; pN: pathologic nodal stage. pN was excluded from the CPCBN baseline model because it was not significantly prognostic for cancer-specific survival in the CPCBN external validation set. For the PLCO external validation set, we used the GS provided by the study instead of GG, and the prostate pathologic stage (which considers T, N, and M stages) as the baseline, due to the study history. The best-performing models are highlighted in bold. Higher AIC and BIC values indicate worse model fit.
^aSince both the GS model and the risk group model were significantly inferior to the full model, we applied the non-nested partial likelihood ratio test to compare the GS Cox model with the risk group Cox model; our risk groups demonstrated non-inferiority to GS, indicating comparable goodness of fit (z = 1.091, p = 0.138). No significant multicollinearity between variables was identified (VIF < 2).
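
The quantities reported in the table can be illustrated with a minimal sketch. It assumes the standard definitions: LR = 2(ℓ_full − ℓ_reduced), referred to a chi-square distribution with degrees of freedom equal to the number of dropped parameters; AIC = 2k − 2ℓ; and BIC = k·ln(n) − 2ℓ, where ℓ is the log partial likelihood. The function names, the log-likelihood values, the parameter counts, and the choice of n (here a number of events) are hypothetical and are not taken from the study.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df (closed form)."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * r + 1)
    return tail


def lr_test(loglik_full: float, loglik_reduced: float, df_diff: int):
    """Nested likelihood ratio test: returns (LR statistic, p-value)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_diff)


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion; n is the effective sample size (assumed here)."""
    return k * math.log(n) - 2 * loglik


# Hypothetical log partial likelihoods, NOT values from the study.
ll_full, k_full = -80.0, 3        # e.g. baseline + risk group + GG
ll_reduced, k_reduced = -81.0, 2  # e.g. baseline + risk group
n_events = 40                     # hypothetical effective sample size

stat, p = lr_test(ll_full, ll_reduced, k_full - k_reduced)
print(f"LR = {stat:.3f}, p = {p:.3f}")
print(f"AIC full = {aic(ll_full, k_full):.2f}, reduced = {aic(ll_reduced, k_reduced):.2f}")
print(f"BIC full = {bic(ll_full, k_full, n_events):.2f}, reduced = {bic(ll_reduced, k_reduced, n_events):.2f}")
```

Under these definitions, the AIC difference between a reduced and a full model equals LR − 2Δk, where Δk is the number of dropped parameters. A non-significant LR therefore goes together with a small or negative AIC penalty for the reduced model.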
