Table 3. Performance Comparison of Various Models Before and After Fine-Tuning on the CBL Dataset.
From: A Connected Building Landscape dataset for Instance Segmentation
| Pre-trained on | Model | Backbone | AP (Before Fine-tuning) | AP (After Fine-tuning) | Inference Time (s/img) |
|---|---|---|---|---|---|
| COCO | Mask R-CNN | R50-FPN | 0.0 | 39.07 | 0.058 |
| COCO | Mask R-CNN | R101-FPN | 0.0 | 40.67 | 0.053 |
| COCO | Mask2Former | R50 | 0.0 | 34.57 | 0.060 |
| COCO | Mask2Former | R101 | 0.0 | 31.52 | 0.120 |
| COCO | Mask2Former | Swin-T | 0.0 | 45.12 | 0.160 |
| COCO | Mask2Former | Swin-L | 0.0 | 48.81 | 0.230 |
| Cityscapes | Mask R-CNN | R50-FPN | 0.0 | 38.48 | 0.046 |
| Cityscapes | Mask2Former | R101 | 0.0 | 26.75 | 0.150 |
| Cityscapes | Mask2Former | Swin-T | 0.0 | 41.24 | 0.150 |
| Cityscapes | Mask2Former | Swin-L | 0.0 | 42.90 | 0.240 |
| ADE20k | Mask2Former | R50 | 0.0 | 21.30 | 0.075 |
| ADE20k | Mask2Former | Swin-T | 0.0 | 32.89 | 0.374 |
| ADE20k | Mask2Former | Swin-L | 0.0 | 30.50 | 0.312 |