Table 5 Weighted mean Dice score across the test set with 10 prompt points for the fine-tuned SAM 2 model after ablating major components (image encoder, mask decoder, and prompt encoder), with fine-tuning performed using 400 training samples per class.

From: A fine-tuned foundational model SurgiSAM2 for surgical video anatomy segmentation and detection

 

| Fine-tuned components of the SAM 2 model | Weighted mean Dice score | Parameters frozen |
| --- | --- | --- |
| Image encoder + mask decoder + prompt encoder | 0.91 | – |
| Mask decoder + prompt encoder | 0.86 (-0.05) | 69.1 million |
| Image encoder + prompt encoder | 0.92 (+0.01) | 4.2 million |
| Image encoder + mask decoder | 0.92 (+0.01) | 6000 |
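
The weighted mean Dice score reported in this table can be illustrated with a minimal sketch. It assumes the weights are per-class test sample counts, which the caption does not specify. The function names (`dice_score`, `weighted_mean_dice`) and the class labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention (an assumption here): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def weighted_mean_dice(
    per_class_dice: Dict[str, float], class_counts: Dict[str, int]
) -> float:
    """Average per-class Dice scores, weighting each class by its sample count."""
    total = sum(class_counts[name] for name in per_class_dice)
    if total == 0:
        raise ValueError("at least one class must have samples")
    weighted_sum = sum(
        per_class_dice[name] * class_counts[name] for name in per_class_dice
    )
    return weighted_sum / total


if __name__ == "__main__":
    # Hypothetical per-class scores and test-set counts, for illustration only.
    scores = {"class_a": 0.9, "class_b": 0.5}
    counts = {"class_a": 3, "class_b": 1}
    print(round(weighted_mean_dice(scores, counts), 3))
```

Under this weighting, classes with more test samples contribute more to the reported figure, so a drop on a rare class moves the mean less than the same drop on a common one.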