Fig. 7: Multimodal fusion performance of overall survival prediction on pathological slides and gene expression data.
From: A multimodal knowledge-enhanced whole-slide pathology foundation model

The patch extractors of all foundation models (FMs) are evaluated with different multimodal fusion models (MCAT, Porpoise, MOTCat and CMTA), trained from scratch across 9 TCGA held-out datasets. a Ranking performance of each FM on the 9 datasets for every multimodal fusion model; “Overall” refers to the average result across these multimodal fusion methods. b Average C-Index across the 9 datasets. c Performance (C-Index and 95% CI) on each dataset. The minima and maxima represent the lower and upper bounds of the 95% CI, respectively. The center of the box represents the mean value, and the bounds of the box represent the 25th and 75th percentiles. P-values are obtained from a one-sided Wilcoxon signed-rank test between mSTAR and the second-best FM. The legend colors are shared across all sub-figures. * represents P < 0.05, ** represents P < 0.01 and *** represents P < 0.001. Detailed performance on every dataset is presented in Supplementary Tables 19–23. Source data are provided as a Source Data file.
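For readers unfamiliar with the metric reported in panels b and c, the following is a minimal sketch of Harrell's concordance index (C-Index) for right-censored survival data. It is an illustration under stated assumptions, not the evaluation code used for this figure: the function name `concordance_index`, its argument names, and the toy values are hypothetical, and the caption does not specify how ties or censoring were handled.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-Index (illustrative sketch, hypothetical helper).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk, earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): the third patient is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of risk against survival time, which is the scale on which the C-Index values in this figure should be read.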