Table 4 Ablation study results on LIDC-IDRI

Model variant	Dice (%)	IoU (%)
Full GLANCE (GCT+MRGAM+CSCF)	95.2	90.8
w/o CSCF (no cross-scale fusion)	82.0	69.5
w/o MRGAM (no atrous conv branch)	82.5	70.2
w/o GCT (no transformer branch)	80.0	66.7
w/o GCT, w/ CSCF (two local streams)^*	80.4	67.2
w/o MRGAM, w/ CSCF (conv + transformer)^*	82.2	69.8
w/o GCT & w/o CSCF (MRGAM only)	80.3	67.1
w/o MRGAM & w/o CSCF (GCT only)	81.0	68.1
w/o GCT & w/o MRGAM (dual plain encoders)	79.5	66.0
w/o GCT & w/o MRGAM & w/o CSCF (baseline U-Net)	79.0	65.3

^*These configurations retain two encoder streams. “w/o GCT, w/ CSCF” uses two identical conv-MRGAM streams fused at each scale, while “w/o MRGAM, w/ CSCF” uses one transformer and one plain conv stream.
We report segmentation accuracy as the Dice similarity coefficient (DSC; the “Dice” column) and Intersection-over-Union (IoU), both in percent. GCT global context transformer, MRGAM multi-receptive grouped atrous mixer, CSCF cross-scale consensus fusion.
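
For readers who want to reproduce the two metrics reported above, the following is a minimal sketch of how Dice and IoU are commonly computed for one pair of binary masks. The function names `dice_and_iou` and `iou_from_dice`, the flattened 0/1 list representation, and the empty-mask convention are illustrative assumptions, not the authors' evaluation code. For a single mask pair the two metrics are related by IoU = DSC / (2 − DSC); that identity need not hold exactly once scores are averaged over a dataset.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, IoU) as fractions for two flattened binary masks.

    Hypothetical helper for illustration; masks are sequences of 0/1 values.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Count overlapping foreground pixels and the size of each mask.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def iou_from_dice(dice: float) -> float:
    """Convert a per-sample Dice score (fraction) to the equivalent IoU."""
    return dice / (2.0 - dice)


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {100 * dice:.1f}%, IoU = {100 * iou:.1f}%")
```

In the toy example the masks share 2 of 4 foreground pixels in the union, so Dice is 2·2/(3+3) and IoU is 2/4. Multiplying by 100 gives the percentage scale used in the table.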
