Table 3 Dice scores of SAM (Segment Anything Model) with the prompt layer fine-tuned on various FOVEA dataset splits.

From: FOVEA: Preoperative and intraoperative retinal fundus images with optic disc and retinal vessel annotations

The first four columns give the training configuration; the last six columns give the Dice score [%] on the test split of the named dataset, with the test-time preparation after the comma. A dash marks combinations that were not evaluated (see footnote).

| Data | Prep | Loss | Steps | i-30/10-1, crop | i-30/10-1, poll-0 | i-30/10-1, poll-0.5 | p-30/10-1, crop | p-30/10-1, res | i-1/39-1, crop |
|---|---|---|---|---|---|---|---|---|---|
| i-30/10-1 | crop | CE | 60000 | 49.6 | 54.8 | 49.5 | 25.1 | 44.9 | – |
| i-30/10-1 | crop | Lovász | 60000 | 53.2 | 50.6 | 54.7 | 55.5 | 65.9 | – |
| i-30/10-12 | crop | Lovász | 60000 | 54.5 | 51.1 | 55.9 | 58.3 | 66.6 | – |
| p-30/10-1 | crop | Lovász | 60000 | 51.5 | 47.2 | 50.8 | 70.3 | 71.0 | – |
| p-30/10-1 | res | Lovász | 60000 | 48.4 | 46.6 | 49.6 | 58.0 | 70.7 | – |
| i-1/39-1 | crop | Lovász | 10000 | 51.0 | 50.3 | 51.3 | 51.6 | 57.6 | 49.3 |
| i-1/39-1 | crop | Lovász | 30000 | 50.4 | 51.2 | 50.3 | 45.6 | 53.0 | 48.3 |
| i-1/39-12 | crop | Lovász | 10000 | 51.4 | 48.3 | 52.2 | 60.9 | 65.2 | 50.7 |
| i-1/39-12 | crop | Lovász | 30000 | 53.5 | 52.1 | 54.0 | 56.0 | 63.3 | 52.2 |

  1. Data naming follows the scheme <domain>-<train#/test#>-<annotator>. The domain is “i” for intraoperative and “p” for preoperative. “train#” and “test#” give the number of images in the respective data split. The annotator is either 1 or 2, or 12 if annotations from both are used randomly during training. Data preparation during training: “crop” for random crops, “res” for resizing. During testing: “crop” for centre crops, or “poll” for polling the final prediction from 5 patches distributed over the original test image and combined at each pixel based on a given majority threshold, either 0 (“poll-0”) or 0.5 (“poll-0.5”). Methods trained on the 30/10 dataset splits were not evaluated on 1/39, as this would have meant testing on data already seen during training. The reverse is not an issue: the single training record in the 1/39 split is not part of the much smaller test set of the 30/10 splits.
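The per-pixel polling described in the footnote can be sketched in NumPy as below. This is a minimal illustration, not the paper's implementation: the function and argument names, the patch placement, and the use of a strict `>` comparison against the threshold are assumptions.

```python
import numpy as np


def poll_patches(patch_preds, patch_offsets, image_shape, threshold):
    """Combine binary patch predictions into one full-image mask by polling.

    A pixel is marked foreground if the fraction of covering patches that
    predict foreground exceeds `threshold`: with threshold 0 ("poll-0") any
    positive vote wins; with threshold 0.5 ("poll-0.5") a strict majority of
    the covering patches is required.

    patch_preds:   list of (h, w) boolean arrays, one prediction per patch
    patch_offsets: list of (top, left) positions of each patch in the image
    image_shape:   (H, W) of the original test image
    """
    votes = np.zeros(image_shape, dtype=np.int32)   # positive votes per pixel
    counts = np.zeros(image_shape, dtype=np.int32)  # covering patches per pixel
    for pred, (top, left) in zip(patch_preds, patch_offsets):
        h, w = pred.shape
        votes[top:top + h, left:left + w] += pred.astype(np.int32)
        counts[top:top + h, left:left + w] += 1
    # Avoid division by zero for pixels covered by no patch (they stay False).
    frac = votes / np.maximum(counts, 1)
    return frac > threshold
```

With two overlapping patches that disagree on a pixel, the vote fraction there is 0.5, so "poll-0" marks it foreground while "poll-0.5" does not; this is the behaviour that separates the two "poll" columns in the table.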