Table 10 Ablation study: impact of prompting strategies on segmentation NAS performance

From: Large language models driven neural architecture search for universal and lightweight disease diagnosis on histopathology slide images

| Backbone | Dataset | Prompt strategy | Iterations ↓ | API calls ↓ | Dice (%) ↑ | IoU (%) ↑ | FLOPs (G) ↓ | Params (M) ↓ |
|---|---|---|---|---|---|---|---|---|
| U-Net | BCSS | EP (Ours) | 10 | 1.8 | 74.33*** | 59.68*** | 10.58 | 11.37 |
| | | GAP | 10 | 1.9 | 73.78 | 58.99 | 19.83 | 15.75 |
| | | NSP | 11 | 2.2 | 73.90 | 59.22 | 17.46 | 12.73 |
| | PanNuke | EP (Ours) | 10 | 1.6 | 89.31*** | 81.35*** | 14.33 | 8.34 |
| | | GAP | 15 | 4.1 | 89.26 | 81.28 | 23.12 | 11.30 |
| | | NSP | 15 | 2.3 | 88.89 | 80.71 | 16.23 | 9.00 |

1. Results are averaged over 5 runs. Within each group, the best result for each metric is typically achieved by EP; the Dice and IoU of EP consistently outperform those of GAP and NSP with very high significance (*** p < 0.001).
2. EP = Expert Prompt (Ours), GAP = Generic Assistant Prompt, NSP = No System Prompt.