Table 3 Quantitative performance metrics of the fine-tuned SAM 2 model across various datasets and classes, showing Dice coefficients for different training data scales (50, 100, 200, and 400 samples per class) with 10 prompt points. Comparisons with prior SOTA methods are highlighted in the last column with the deltas in parentheses, improvements in green and declines in red.
From: A fine-tuned foundational model SurgiSAM2 for surgical video anatomy segmentation and detection
Datasets and their organ classes (n = number of examples in validation, test subsets) | Training data scale 50/class (10 point prompts) | Training data scale 100/class (10 point prompts) | Training data scale 200/class (10 point prompts) | Training data scale 400/class (10 point prompts) | Prior SOTA score | Prior SOTA model | MedSAM (vit-b) | Test subset with 1 prompt point | Test subset with 10 prompt points (delta) |
|---|---|---|---|---|---|---|---|---|---|
CholecSeg8k (4548, 3611) | |||||||||
 Abdominal wall (951, 943) | 0.97 | 0.97 | 0.97 | 0.97 | 0.84 | DynUnet33 | 0.30 | 0.88 | 0.96 (+ 0.12) |
 Fat (720, 640) | 0.94 | 0.95 | 0.96 | 0.97 | 0.91 | U-net++33 | 0.21 | 0.89 | 0.96 (+ 0.05) |
 Gallbladder (720, 537) | 0.90 | 0.90 | 0.91 | 0.91 |  | Segmenter34 | 0.41 | 0.90 | 0.91 |
 Gastrointestinal tract (557, 668) | 0.96 | 0.95 | 0.96 | 0.96 | 0.35 | UNETR33 | 0.66 | 0.88 | 0.89 (+ 0.54) |
 Liver (1600, 640) | 0.93 | 0.93 | 0.93 | 0.94 | 0.98 | uNet/uNet++/DeepLABv3+35 | 0.47 | 0.91 | 0.96 (− 0.02) |
Weighted mean Dice coefficient for dataset (for tissue classes) | | | | | | | | | 0.94 |
Dresden (621, 851) | |||||||||
 Abdominal wall† (56, 114) | 0.93 | 0.92 | 0.93 | 0.90 | 0.91 | SegFormer36 | 0.48 | 0.56 | 0.81 (− 0.10) |
 Colon (52, 121) | 0.89 | 0.90 | 0.89 | 0.91 | 0.79 | DeepLABv336 | 0.31 | 0.83 | 0.92 (+ 0.13) |
 Inferior mesenteric artery (61, 44) | 0.77 | 0.78 | 0.77 | 0.79 | 0.6 | SegFormer36 | 0.25 | 0.77 | 0.86 (+ 0.26) |
 Intestinal veins (49, 52) | 0.74 | 0.81 | 0.78 | 0.78 | 0.65 | SegFormer36 | 0.20 | 0.72 | 0.74 (+ 0.09) |
 Liver† (83, 81) | 0.93 | 0.93 | 0.93 | 0.94 | 0.83 | SegFormer36 | 0.42 | 0.88 | 0.96 (+ 0.13) |
 Pancreas (42, 51) | 0.84 | 0.87 | 0.86 | 0.88 | 0.47 | SegFormer36 | 0.22 | 0.78 | 0.81 (+ 0.34) |
 Small intestine (53, 108) | 0.95 | 0.95 | 0.95 | 0.96 | 0.89 | SegFormer36 | 0.31 | 0.90 | 0.95 (+ 0.06) |
 Spleen (50, 14) | 0.97 | 0.96 | 0.97 | 0.97 | 0.85 | SegFormer36 | 0.40 | 0.87 | 0.91 (+ 0.06) |
 Stomach (78, 129) | 0.90 | 0.91 | 0.90 | 0.93 | 0.75 | SegFormer36 | 0.36 | 0.91 | 0.91 (+ 0.16) |
 Ureter (43, 71) | 0.50 | 0.54 | 0.57 | 0.54 | 0.58 | SegFormer36 | 0.28 | 0.68 | 0.75 (+ 0.17) |
 Vesicular glands (54, 66) | 0.76 | 0.80 | 0.80 | 0.78 | 0.43 | SegFormer36 | 0.20 | 0.63 | 0.74 (+ 0.31) |
Weighted mean Dice coefficient for dataset (for tissue classes) | | | | | | | | | 0.89 |
UreterUD (115, 114) | |||||||||
 Uterine artery (50, 22) | 0.89 | 0.88 | 0.90 | 0.89 | 0.92 | U-Net23 | 0.19 | 0.79 | 0.86 (− 0.06) |
 Nerve (30, 38) | 0.71 | 0.72 | 0.78 | 0.78 | 0.90 | U-Net23 | 0.23 | 0.77 | 0.76 (− 0.14) |
 Ureter (35, 54) | 0.86 | 0.87 | 0.86 | 0.86 | 0.89 | U-Net23 | 0.36 | 0.90 | 0.91 (+ 0.02) |
Weighted mean Dice coefficient for dataset (for tissue classes) | | | | | | | | | 0.85 |
Endoscapes (226, 183) | Only single metric reported | | | ||||||
 Cystic artery (34, 47) | 0.66 | 0.71 | 0.73 | 0.72 | 0.74 | Mask2Former24 | 0.32 | 0.70 | 0.75 (+ 0.01) |
 Cystic duct (65, 53) | 0.72 | 0.77 | 0.75 | 0.79 | 0.74 | Mask2Former24 | 0.35 | 0.55 | 0.74 (0.00) |
 Cystic plate (29, 18) | 0.72 | 0.77 | 0.79 | 0.75 | 0.74 | Mask2Former24 | 0.38 | 0.59 | 0.74 (0.00) |
 Gallbladder† (73, 58) | 0.84 | 0.83 | 0.85 | 0.84 | 0.74 | Mask2Former24 | 0.33 | 0.58 | 0.80 (+ 0.06) |
 Hepatocystic triangle (25, 7) | 0.64 | 0.67 | 0.67 | 0.62 | 0.74 | Mask2Former24 | 0.34 | 0.78 (+ 0.04) | 0.66 |
Weighted mean Dice coefficient for dataset (for tissue classes) | | | | | | | | | 0.75 |
m2caiSeg (317, 154) | |||||||||
 Artery† (11, 12) | 0.71 | 0.74 | 0.72 | 0.69 | 0.16 | Custom encoder-decoder CNN25 | 0.22 | 0.80 (+ 0.64) | 0.78 (+ 0.62) |
 Fat† (28, 30) | 0.78 | 0.79 | 0.81 | 0.78 | 0.71 | Custom encoder-decoder CNN25 | 0.20 | 0.61 | 0.80 (+ 0.09) |
 Gallbladder† (249, 49) | 0.81 | 0.82 | 0.83 | 0.79 | 0.7 | Custom encoder-decoder CNN25 | 0.33 | 0.55 | 0.82 (+ 0.12) |
 Intestine† (7, 9) | 0.90 | 0.89 | 0.87 | 0.90 | 0.33 | Custom encoder-decoder CNN25 | 0.29 | 0.78 | 0.84 (+ 0.51) |
 Liver† (29, 31) | 0.88 | 0.87 | 0.88 | 0.90 | 0.87 | Custom encoder-decoder CNN25 | 0.35 | 0.74 | 0.90 (+ 0.03) |
 Upper wall (22, 23) | 0.96 | 0.95 | 0.96 | 0.95 | 0.58 | Custom encoder-decoder CNN25 | 0.42 | 0.91 | 0.96 (+ 0.38) |
Weighted mean Dice coefficient for dataset (for tissue classes) | | | | | | | | | 0.85 |
Weighted mean Dice coefficient (for all tissue classes)†† | 0.91 | 0.92 | 0.92 | 0.92 | – |  | 0.38 | 0.85 | 0.91 |
Mean Dice coefficient (for all tissue classes) | 0.83 | 0.85 | 0.85 | 0.85 | – |  | 0.33 | 0.77 | 0.85 |
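
The table does not state how the weighted mean rows are computed. As a minimal sketch, assuming each class is weighted by its test-subset count and that the delta is the 10-prompt-point Dice minus the prior SOTA score, the CholecSeg8k rows can be checked as follows. The function and variable names are illustrative and not taken from the paper.

```python
# Rows copied from the CholecSeg8k block of the table:
# (class name, test-subset count, prior SOTA Dice, Dice with 10 prompt points)
CHOLECSEG8K = [
    ("Abdominal wall", 943, 0.84, 0.96),
    ("Fat", 640, 0.91, 0.96),
    ("Gallbladder", 537, None, 0.91),  # no prior SOTA score listed in the table
    ("Gastrointestinal tract", 668, 0.35, 0.89),
    ("Liver", 640, 0.98, 0.96),
]


def weighted_mean_dice(rows):
    """Mean Dice weighted by test-subset count (assumed weighting scheme)."""
    total = sum(count for _, count, _, _ in rows)
    return sum(count * dice for _, count, _, dice in rows) / total


def delta_vs_prior(prior, dice):
    """Difference to the prior SOTA score, or None when no prior score exists."""
    if prior is None:
        return None
    return round(dice - prior, 2)


if __name__ == "__main__":
    print(round(weighted_mean_dice(CHOLECSEG8K), 2))  # 0.94
    for name, _, prior, dice in CHOLECSEG8K:
        print(name, delta_vs_prior(prior, dice))
```

Under this assumption the result rounds to 0.94, which matches the weighted mean reported for CholecSeg8k; whether the authors used exactly this weighting is not confirmed by the table.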