Table 2. Comprehensive comparison on ADAM, IntrA, and CQ500
Method | Backbone | DSC (%) ↑ | HD95 (mm) ↓ | Sens (%) ↑ | FP/case ↓ | FROC-AUC ↑ | Params (M) ↓ | FLOPs (G) ↓ |
|---|---|---|---|---|---|---|---|---|
CNN/Transformer baselines | ||||||||
3D U-Net | CNN | 73.2 (72.1–74.3) | 5.8 (5.5–6.1) | 71.0 (69.8–72.2) | 1.45 (1.38–1.52) | 0.742 | 34.5 | 168.2 |
nnU-Net | CNN | 77.9 (76.8–79.0) | 4.9 (4.7–5.1) | 75.6 (74.3–76.9) | 1.21 (1.15–1.27) | 0.781 | 31.8 | 165.1 |
UNETR | ViT | 79.1 (78.0–80.2) | 4.7 (4.5–4.9) | 77.3 (76.0–78.6) | 1.18 (1.11–1.25) | 0.796 | 87.3 | 215.4 |
Swin-UNETR | Swin-T | 80.2 (79.1–81.3) | 4.5 (4.3–4.7) | 78.8 (77.5–80.1) | 1.12 (1.06–1.18) | 0.812 | 62.7 | 190.3 |
MedNeXt | CNN++ | 81.0 (79.9–82.1) | 4.4 (4.2–4.6) | 79.2 (78.0–80.4) | 1.09 (1.03–1.15) | 0.821 | 102.1 | 240.8 |
Foundation/Promptable models | ||||||||
MedSAM | SAM | 78.6 (77.4–79.8) | 4.9 (4.7–5.1) | 76.5 (75.1–77.9) | 1.34 (1.27–1.41) | 0.772 | 91.5 | 220.1 |
SAM-Med3D | SAM-3D | 80.9 (79.8–82.0) | 4.6 (4.4–4.8) | 78.9 (77.6–80.2) | 1.20 (1.14–1.26) | 0.801 | 93.2 | 224.5 |
Self-supervised/MAE | ||||||||
Vanilla MAE | ViT | 80.1 (79.0–81.2) | 4.5 (4.3–4.7) | 78.4 (77.1–79.7) | 1.14 (1.08–1.20) | 0.808 | 87.3 | 215.4 |
Med-MAE | ViT | 81.4 (80.3–82.5) | 4.2 (4.0–4.4) | 80.0 (78.8–81.2) | 1.05 (0.99–1.11) | 0.828 | 87.3 | 215.4 |
Domain generalization baselines | ||||||||
Meta-DG | CNN | 78.5 (77.3–79.7) | 4.7 (4.5–4.9) | 76.8 (75.5–78.1) | 1.28 (1.21–1.35) | 0.782 | – | – |
DG Survey SOTA | ViT | 80.7 (79.6–81.8) | 4.4 (4.2–4.6) | 79.1 (77.9–80.3) | 1.15 (1.09–1.21) | 0.810 | – | – |
AMAP (ours) | ViT+Prompt+DG | 84.6 (83.7–85.5)* | 3.9 (3.7–4.1)* | 83.1 (82.0–84.2)* | 0.89 (0.84–0.94)* | 0.861* | 88.1 | 216.2 |