Table 5 Performance comparison of existing CNN, hybrid CNN–transformer models, and the proposed model in centralized and federated settings.
From: Enhanced brain tumour segmentation using a hybrid dual encoder–decoder model in federated learning
| Model | Architecture type | Dice (Centralized) | IoU (Centralized) | Dice (Federated) | IoU (Federated) | Notes |
|---|---|---|---|---|---|---|
| U-Net | CNN | 0.85 | 0.81 | 0.82 | 0.80 | Fast training, lacks context modelling |
| UNet++ | CNN (nested) | 0.89 | 0.83 | 0.88 | 0.79 | Better boundaries, more computation |
| ResUNet | CNN + residuals | 0.88 | 0.82 | 0.87 | 0.79 | Improved convergence, still limited receptive field |
| TransBTS [29] | 3D CNN + Transformer | 0.92 | 0.86 | – | – | Effective for multimodal data, but high compute cost |
| UNETR [30] | CNN + Transformer encoder | 0.90 | 0.84 | – | – | Volumetric segmentation, not privacy-focused |
| 3D-UNet [40] | 3D CNN | – | – | 0.86 | – | Deep model for 3D images |
| SU-Net [41] | CNN + Inception | – | – | 0.78 | – | Efficient, multi-scale receptive fields |
| U-shaped model [42] | CNN + Inception | – | – | 0.88 | – | Multi-encoder, lacks global features |
| Proposed model | EfficientNet + Swin + BASNet + MaskFormer | 0.94 | 0.87 | 0.94 | 0.87 | Highest performance, boundary refinement |
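For reference, the Dice and IoU scores reported above are standard overlap metrics between a predicted segmentation mask and the ground truth. A minimal sketch of how they are typically computed for binary masks (illustrative only; function names and the toy masks are our own, not from the paper):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """Intersection over Union (Jaccard): |A ∩ B| / |A ∪ B|."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

# Toy 4x4 masks: prediction covers 4 pixels, ground truth 3, overlap 3
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
print(round(dice_score(pred, target), 3))  # 2*3/(4+3) ≈ 0.857
print(round(iou_score(pred, target), 3))   # 3/4 = 0.75
```

Note that Dice is always at least as large as IoU for the same masks, which is consistent with each row of the table.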