Fig. 4

Detailed comparison with PCA. Detection performance of SDCM (top) and PCA (bottom) for different test scenarios. Colors indicate correlations of detected gene axes (or PCs, in the case of PCA) to simulated ones (Supplementary Note 8). Each matrix column represents a simulated signature; each row represents a detected one. Red diagonals indicate high detection sensitivity; dark off-diagonals indicate high specificity. Bright pixels in the same row indicate a detected gene axis (or PC) that represents a mixture of multiple simulated signatures. Bright pixels in the same column indicate a simulated signature whose signal was split over several detected ones. a Representative results from SDCM and PCA for the 7-signature versatility test (from Fig. 3; all runs in Supplementary Figs. 1 and 6). SDCM gene axes and PCs are presented in best-matching order. SDCM detected all seven signatures and no false positives (FPs). PCA yielded 100 PCs; the top 7 by variance are shown. PCA performed quite well but mixed signatures #2 and #4. b Representative results for the 13-signature versatility test (all runs in Supplementary Fig. 9). SDCM yielded one FP and slightly mixed signatures #8 and #9; otherwise, it retained high detection specificity. The top 13 PCs showed considerably more mixing; only signature #1 retained high specificity. c Representative results for the superposition test with 16 instances of signature #3 (from Fig. 3a; complete series in Supplementary Fig. 11). SDCM maintained high detection performance, while PCA’s sensitivity and specificity broke down. d For a versatility test as in (a), but after random deletions resulting in 80% missing values (Supplementary Fig. 15), SDCM still detected 4/7 signatures with high sensitivity. (PCA does not support signals containing missing values.)
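
As an illustration of how such a matrix can be read, the following minimal sketch builds a detected-versus-simulated correlation matrix and reorders the detected rows so that the best matches fall on the diagonal. It is not the procedure of Supplementary Note 8: the function names (`pearson`, `correlation_matrix`, `best_matching_order`), the use of Pearson correlation, the matching on absolute correlation, and the greedy assignment are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant axis: correlation is undefined, treated as 0 here
    return cov / math.sqrt(vx * vy)


def correlation_matrix(detected, simulated):
    """Rows are detected axes, columns are simulated signatures."""
    return [[pearson(d, s) for s in simulated] for d in detected]


def best_matching_order(corr):
    """Greedy row order (an assumed heuristic, not the paper's method).

    Repeatedly takes the largest remaining |correlation| and assigns that
    detected row to that simulated column. Rows left without a partner
    (candidate false positives) are appended at the end.
    """
    n_rows = len(corr)
    n_cols = len(corr[0]) if corr else 0
    pairs = sorted(
        ((abs(corr[i][j]), i, j) for i in range(n_rows) for j in range(n_cols)),
        reverse=True,
    )
    row_for_col = {}
    used_rows = set()
    for _, i, j in pairs:
        if i not in used_rows and j not in row_for_col:
            row_for_col[j] = i
            used_rows.add(i)
    order = [row_for_col[j] for j in range(n_cols) if j in row_for_col]
    order += [i for i in range(n_rows) if i not in used_rows]
    return order


# Toy data (hypothetical): two simulated signatures, three detected axes.
simulated = [[1, 2, 3, 4, 5], [1, -1, 1, -1, 1]]
detected = [[1, -1, 1, -1, 1.2], [2, 4, 6, 8, 10], [0, 1, 0, 0, 1]]

corr = correlation_matrix(detected, simulated)
order = best_matching_order(corr)
reordered = [corr[i] for i in order]  # strong matches now lie on the diagonal
```

In this toy case the third detected axis correlates only weakly with both simulated signatures, so it ends up as the last row, which is where an unmatched detection such as the FP in panel b would appear under this ordering.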