Fig. 3: Minimalist pan-cancer classifier for 28 cancer classes (n = 2575 cases, 53 unique CpG probes, accuracy = 94.5%). | Modern Pathology

From: Minimalist approaches to cancer tissue-of-origin classification by DNA methylation

a t-distributed stochastic neighbor embedding (t-SNE) plot for the smallest hybrid model, based on 53 unique CpG sites, shows excellent separation of the cancer classes. b Heat map of the confusion matrix for the smallest hybrid model; see Supplementary Table 12 for the number of cases in each cell. c Relationship between classifier confidence and accuracy. The high-confidence group contains 994 cases (39% of validation cases) with 100% accuracy, the moderate-confidence group 1147 cases (45%) with 98% accuracy, and the low-confidence group 434 cases (17%) with 73% accuracy. d Correctly classified cases have significantly higher tumor purity than incorrectly classified cases (Wilcoxon test p value = 2.8 × 10⁻⁴), although the difference between the distributions is modest. e Density scatter plot showing a weak positive correlation between purity and prediction confidence (Spearman rho = 0.19); many TCGA cases have fairly high purity (>50%), and many have high-confidence predictions. Conceivably, these 53 probes could be quantitatively evaluated via next-generation sequencing.
ACC adrenocortical carcinoma, BLCA bladder carcinoma, BRCA breast invasive carcinoma, CESC cervical and endocervical cancers, CHOL cholangiocarcinoma, CORE colorectal adenocarcinoma, DLBC diffuse large B-cell lymphoma, ESCC esophageal squamous cell carcinoma, GBMLGG glioma (glioblastoma and low-grade glioma), GEAD gastric and esophageal carcinoma, HNSC head and neck squamous cell carcinoma, KIPAN pan-kidney cohort (clear cell, chromophobe, and papillary renal cell carcinoma), LAML acute myeloid leukemia, LIHC liver hepatocellular carcinoma, LUAD lung adenocarcinoma, LUSC lung squamous cell carcinoma, MESO mesothelioma, PAAD pancreatic adenocarcinoma, PCPG pheochromocytoma and paraganglioma, PRAD prostate adenocarcinoma, SARC sarcoma, SKCM skin cutaneous melanoma, TGCT testicular germ cell tumor, THCA thyroid carcinoma, THYM thymoma, UCEC uterine corpus endometrial carcinoma, UCS uterine carcinosarcoma, UVM uveal melanoma.
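
The confidence-group figures quoted for panel c can be checked against the headline numbers in the figure title. The following is a minimal sketch, not the authors' code: it uses only the case counts and rounded accuracies stated in the caption, and the variable and function names (`groups`, `share`, `overall`) are illustrative. Because the per-group accuracies are rounded to whole percentages, the case-weighted result should be expected to land close to, rather than exactly on, the reported 94.5%.

```python
# (number of cases, accuracy) per confidence group, as quoted in the caption for panel c.
# Accuracies are the rounded values from the caption, so results are approximate.
groups = {
    "high": (994, 1.00),
    "moderate": (1147, 0.98),
    "low": (434, 0.73),
}

# Total validation cases: should match n = 2575 from the figure title.
total_cases = sum(n for n, _ in groups.values())


def share(n):
    """Fraction of the validation set represented by n cases."""
    return n / total_cases


# Case-weighted average of the per-group accuracies.
overall = sum(n * acc for n, acc in groups.values()) / total_cases

for name, (n, acc) in groups.items():
    print(f"{name:>8}: {n:4d} cases, {share(n):.0%} of validation set, accuracy {acc:.0%}")
print(f"total cases: {total_cases}")
print(f"case-weighted accuracy: {overall:.1%}")
```

The three group sizes sum to the 2575 validation cases, and the weighted accuracy is dominated by the high- and moderate-confidence groups, which together cover most of the validation set.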
