Fig. 3: Multivariate models developed using extracted morphometric features predict aggressiveness in ccRCC. | Nature Communications

From: PHARAOH: A collaborative crowdsourcing platform for phenotyping and regional analysis of histology

A Schematic of the hand-crafted 3-feature model designed to capture key aspects of clear cell renal cell carcinoma (ccRCC) Fuhrman grading. Created with BioRender.com. Diamandis, P. (2025) https://BioRender.com/c69l485. B Sample case input (top) and Class Activation Map (CAM) output (bottom) of a representative case from The Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) cohort. The custom region of interest (ccRCC) is shown in brown, while normal renal parenchyma and fibroconnective tissue are shown in cyan and yellow, respectively. Scale bars = 2 mm. C, D Box plots showing aggregate nuclear feature values, separated by pathologist-reported nuclear grade, in the TCGA-KIRC study (test set, n = 446 subjects; G1-4: 12, 194, 183, and 57, respectively) and in a local cohort from Saint Michael’s Hospital (SMH; n = 35 subjects; G1-4: 8, 10, 9, and 8, respectively), respectively. Legend is shown between panels. E Kaplan–Meier (KM) survival curves for the TCGA-KIRC cohort split into “high” (yellow) and “low” (blue) aggregate nuclear feature score groups based on the overall cohort’s median value. F Variable importance in the XGBoost model for survival, trained with the 160 morphometric features. G, H Box plots showing predicted risk scores stratified by nuclear grade in the TCGA-KIRC study (test dataset, n = 242 subjects; G1-4: 5, 111, 98, and 28, respectively) and in the local cohort (n = 35 subjects; G1-4: as above), respectively. Legend is shown between panels. I KM analysis for the TCGA-KIRC cohort (test dataset, n = 242 subjects) split into groups with “high” (pink) and “low” (turquoise) risk scores shows a more pronounced survival difference than the hand-crafted model. All box plots in this figure show minimum, first quartile, median, third quartile, and maximum. For box plots, p-value thresholds from a two-sided ANOVA test are denoted as follows: *p < 0.05, **p < 0.01, and ***p < 0.001. NS = not significant. P-values for KM survival curves are from two-sided log-rank tests. Shaded bands show 95% confidence intervals of the variance in survival estimates (standard deviation). Corrections for multiple comparisons were not relevant to these analyses. All relevant source data for this figure are provided as Supplementary Data files.
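
The median split and Kaplan–Meier estimate described for panels E and I can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names `median_split` and `kaplan_meier`, the toy scores and follow-up times, and the rule that a score must be strictly above the median to count as “high” are all assumptions made for the example.

```python
from statistics import median


def median_split(scores):
    """Label each score 'high' if strictly above the cohort median, else 'low'.

    The tie-handling rule (strictly above) is an assumption for this sketch.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up duration per subject
    events: 1 if the event (death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (hypothetical): risk scores, follow-up in months, event indicators.
scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8]
times = [60, 12, 48, 20, 72, 30]
events = [0, 1, 1, 1, 0, 1]

groups = median_split(scores)
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
```

In this toy cohort, every subject in the “high” group has an observed event, so its curve steps down to zero, while the “low” group has a single event followed by censored subjects. A log-rank test comparing the two curves would be a separate step and is not shown here.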
