Fig. 3: Performance of pathological diagnosis on 21 datasets. | Nature Communications


From: A multimodal knowledge-enhanced whole-slide pathology foundation model


a The overall performance on pathological diagnosis. b The performance on 8 independent datasets. c The performance on 10 external datasets. The red lines and the values reported at the top of panels (a–c) refer to the performance averaged across datasets. Each point represents a dataset, with the size of the point indicating the standard deviation. d The performance on 3 held-out datasets. The lower and upper bounds of each box represent the minimum and maximum performance among the corresponding datasets, respectively. e Task distribution of pathological diagnosis across sites for the different evaluations. f The overall performance on pathological subtyping across 10 datasets. g The performance on 6 external datasets of pathological subtyping. Error bars represent standard errors across datasets for all bar plots in (f, g). h, i Visual validation of attention scores from mSTAR on the (h) CAMELYON and (i) PANDA datasets. The P-value for each group of experiments is obtained from a one-sided Wilcoxon signed-rank test between mSTAR and the second-best foundation model (FM). * represents P < 0.05, ** represents P < 0.01 and *** represents P < 0.001. Detailed performance on every dataset is presented in Supplementary Fig. 2 and Supplementary Table 7. Source data are provided as a Source Data file.
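
The significance markers in the caption can be illustrated with a minimal sketch of a one-sided Wilcoxon signed-rank test on paired per-dataset scores. This is not the authors' code: the function names `wilcoxon_one_sided` and `significance_stars` are hypothetical, the exact (enumeration-based) p-value and the dropping of zero differences are assumptions about conventions the caption does not specify, and any scores passed in are the reader's own.

```python
from collections import Counter


def wilcoxon_one_sided(model_scores, baseline_scores):
    """Exact one-sided Wilcoxon signed-rank p-value (H1: model > baseline).

    Assumption: pairs with zero difference are dropped before ranking.
    """
    diffs = [m - b for m, b in zip(model_scores, baseline_scores) if m != b]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank the absolute differences; tied values share their average rank.
    # Ranks are stored doubled (first + last position) so they stay integers.
    ordered = sorted(abs(d) for d in diffs)
    first, last = {}, {}
    for pos, value in enumerate(ordered, start=1):
        first.setdefault(value, pos)
        last[value] = pos
    doubled_ranks = [first[abs(d)] + last[abs(d)] for d in diffs]

    # Observed statistic: (doubled) sum of ranks of the positive differences.
    observed = sum(r for r, d in zip(doubled_ranks, diffs) if d > 0)

    # Null distribution: each rank is positive or negative with probability 1/2.
    # Count, for every attainable rank sum, how many sign patterns produce it.
    counts = Counter({0: 1})
    for r in doubled_ranks:
        updated = Counter()
        for total, ways in counts.items():
            updated[total] += ways      # this difference is negative
            updated[total + r] += ways  # this difference is positive
        counts = updated

    at_least_observed = sum(w for total, w in counts.items() if total >= observed)
    return at_least_observed / 2 ** n


def significance_stars(p_value):
    """Map a p-value to the star notation used in the caption."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""
```

Under these assumptions, the smallest attainable one-sided p-value with n paired datasets is 1/2^n, so a group of five datasets can earn at most one star (1/32 ≈ 0.031) even if the model wins on every dataset.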
