Extended Data Fig. 8: The models’ average performance in the real-world setting (n = 950).
From: A generalist foundation model and database for open-world medical image segmentation

a. Average zero-shot performance of 6 models on 5 real-world datasets. b. Average fine-tuning performance of 6 models on 5 real-world datasets. P-values are calculated with a two-sided t-test. Bar graphs indicate the mean ± 95% CI. The dashed horizontal line represents the Dice score achieved by MedSegX.
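
For readers unfamiliar with the quantities plotted here, the following is a minimal Python sketch of how a Dice score and a mean ± 95% CI could be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `dice_score` and `mean_ci95` are hypothetical, masks are assumed to be flat binary sequences, two empty masks are assumed to score 1.0, and the interval uses a normal approximation (reasonable for large n such as 950), which may differ from the method used for the figure.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks (0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_ci95(values):
    """Return (mean, lower, upper) using a normal-approximation 95% CI."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for a confidence interval")
    m = mean(values)
    z = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-sided 95% interval
    half_width = z * stdev(values) / sqrt(n)  # stdev is the sample std (n - 1)
    return m, m - half_width, m + half_width
```

In this sketch, per-case Dice scores would be collected into a list and passed to `mean_ci95` to obtain a bar height and its error bar. For small samples, a t-distribution critical value would be the more appropriate choice than the normal quantile used above.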