Fig. 5: Distribution of Dice scores on the TCGA-STAD test set.
From: Geometric multi-instance learning for weakly supervised gastric cancer segmentation

Each violin plot shows the probability density of the Dice score for one method across all test slides. TransMIL shows high variance. Patch-WI improves the median but remains inconsistent. HistoGraph (ref. 39) is competitive but has a substantial number of low-performing outliers. Our Geo-MIL framework achieves both the highest median and the lowest variance among the compared methods, indicating superior accuracy and robustness.
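
For readers who want to reproduce this kind of comparison, the following is a minimal sketch of how a per-slide Dice score and the summary statistics discussed in the caption (median and variance) could be computed. It is not the authors' evaluation code. The function names `dice_score` and `summarize`, the flat 0/1 mask representation, the convention that two empty masks score 1.0, and the use of the interquartile range as an additional spread measure are all assumptions made for illustration.

```python
from statistics import median, quantiles, variance


def dice_score(pred, target):
    """Dice = 2|A n B| / (|A| + |B|) for two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def summarize(scores):
    """Summarize one method's per-slide Dice scores (needs at least two scores)."""
    q1, _, q3 = quantiles(scores, n=4)  # quartile cut points
    return {
        "median": median(scores),
        "variance": variance(scores),  # sample variance
        "iqr": q3 - q1,  # assumed extra spread measure, robust to outliers
    }
```

Calling `summarize` once per method on its list of per-slide scores gives the numbers that a violin plot conveys visually: a higher median corresponds to a higher-centered violin, and a smaller variance or IQR corresponds to a shorter, more compact one.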