Fig. 4: Validation of a crossNN pan-cancer classifier. | Nature Cancer


From: crossNN is an explainable framework for cross-platform DNA methylation-based classification of tumors


a,b, Overview of the pan-cancer training dataset. Uniform manifold approximation and projection (UMAP) dimensionality reduction depicts the reference dataset of 8,382 reference tumors (a), including four major groups of tumors (b). c, Confusion matrix showing the internal validation of the crossNN pan-cancer model (n = 8,382 training samples). d–u, Independent validation of the model across different platforms. d,g,j,m,p,s, Distribution of the number of CpG features used as input to the crossNN model: 450K (d), EPIC (g), nanopore R9 (j), nanopore R10 (m), targeted sequencing (p) and WGBS (s). e,h,k,n,q,t, Waterfall plots of cohorts with samples ranked according to confidence score. The dashed lines indicate platform-specific cutoff values chosen based on fivefold cross-validation (CV). f,i,l,o,r,u, Receiver operating characteristics of confidence scores regarding the correct classification on MC versus MCF level. v,w, Accuracy (v) and precision (w) in the validation cohort per major tumor group across all platforms (carcinoma n = 3,005, hematolymphoid n = 32, neuroepithelial n = 2,079, sarcoma n = 263 cases, respectively). x, Classification of renal cell carcinoma. The confusion matrix shows fractions relative to the total number of cases per subtype (kidney chromophobe renal cell carcinoma (KICH) n = 20, kidney renal clear cell carcinoma (KIRC) n = 107, kidney renal papillary carcinoma (KIRP) n = 86 cases, respectively). The columns indicate the ground truth, the rows indicate the crossNN predictions. BLCA, bladder urothelial carcinoma.
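
The normalization described for panel x (fractions relative to the number of cases per subtype, columns as ground truth, rows as predictions) can be illustrated with a minimal sketch. This is not the authors' code: the function name `column_normalized_confusion` and the toy labels below are hypothetical and serve only to show how each ground-truth column is scaled to sum to 1.

```python
import numpy as np

def column_normalized_confusion(truth, predicted, labels):
    """Return a matrix with rows = predictions and columns = ground truth,
    where each column holds fractions of that subtype's total cases."""
    index = {label: i for i, label in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)), dtype=float)
    for t, p in zip(truth, predicted):
        counts[index[p], index[t]] += 1  # row: prediction, column: ground truth
    totals = counts.sum(axis=0, keepdims=True)  # cases per ground-truth subtype
    # Columns with no ground-truth cases stay at zero instead of dividing by zero.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

# Toy data (invented for illustration, not taken from the figure).
labels = ["BLCA", "KICH", "KIRC", "KIRP"]
truth = ["KICH", "KICH", "KIRC", "KIRC", "KIRC", "KIRC", "KIRP", "KIRP"]
predicted = ["KICH", "KIRC", "KIRC", "KIRC", "KIRC", "BLCA", "KIRP", "KIRC"]

matrix = column_normalized_confusion(truth, predicted, labels)
print(matrix)
```

In this layout a label such as BLCA can appear as a prediction row even when no sample has it as ground truth, in which case its column is all zeros.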

