Table 4 Methods and team scores for the semantic segmentation of artefacts.

Team	Method	Nature	Backbone	DSC	Jaccard	Overlap	F2-score	PPV	Recall	s-score
yangsuhui	DeepLabV3+	Ensemble	ResNet-101 + MobileNetv2	0.6810	0.6416	0.6612	0.6779	0.8789	0.7148	0.6654
swtnb	Mask R-CNN+YOLOv3	Symbiosis	ResNet-101	0.6496	0.6041	0.6269	0.6585	0.7515	0.7594	0.6348
YWa	—	—	—	0.6392	0.6021	0.6206	0.6243	0.9039	0.6602	0.6216
VegZhang	—	—	—	0.6141	0.5831	0.6185	0.6185	0.8386	0.6839	0.6036
michaelqiyao	PSPNet	Pyramid pooling	ResNet-34	0.6141	0.5787	0.5964	0.6171	0.8164	0.6987	0.6016
Ig920810	—	—	—	0.6079	0.5684	0.5882	0.5972	0.8189	0.6802	0.5904
Weiminson	—	—	—	0.6011	0.5631	0.5821	0.5839	0.8375	0.6598	0.5825
ZhangPY	Mask-aided R-CNN	Symbiosis	ResNet-101	0.5719	0.5397	0.5558	0.5701	0.7719	0.6581	0.5594
nqt52798669	Cascaded R-CNN + DLA	Ensemble	ResNet-101 + DLA60	0.5414	0.4998	0.506	0.5331	0.6290	0.6887	0.5237
ShuganYang	U-Net-D	Semantic	ResNet-50	0.4119	0.3797	0.3958	0.3998	0.6407	0.6360	0.3968
Baseline	U-Net	Semantic	FCN	0.5490	0.5030	0.5260	0.5580	0.6691	0.7488	0.5340
Super Baseline	Merged	Semantic	—	0.6782	0.6356	0.6569	0.6703	0.8747	0.7178	0.6603

Teams are ordered by decreasing s-score, with the two baselines listed last. An off-the-shelf U-Net²⁴ is reported as the baseline for comparison. We also include the performance of a super segmentation, denoted “Merged”, constructed by keeping the consensus predicted segmentations from all teams. The most common architectures were variations of the two-stage Mask R-CNN¹⁵ network (swtnb, ZhangPY). The deep encoder-decoder DeepLabV3+¹⁹ of yangsuhui obtained the highest s-score (0.6654), whereas YWa achieved the highest PPV (0.9039). ‘—’ denotes missing information.
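The Overlap and s-score columns appear to be derived from the other columns rather than measured independently. The sketch below is a consistency check on the tabulated values under two assumptions that are not stated in this caption: that Overlap is the mean of DSC and Jaccard, and that the s-score is 0.75 × Overlap + 0.25 × F2-score. The helper names (`overlap`, `s_score`, `check`) and the tolerance are illustrative only.

```python
# Rows copied from Table 4: (team, DSC, Jaccard, Overlap, F2-score, s-score).
ROWS = [
    ("yangsuhui",      0.6810, 0.6416, 0.6612, 0.6779, 0.6654),
    ("swtnb",          0.6496, 0.6041, 0.6269, 0.6585, 0.6348),
    ("YWa",            0.6392, 0.6021, 0.6206, 0.6243, 0.6216),
    ("VegZhang",       0.6141, 0.5831, 0.6185, 0.6185, 0.6036),
    ("michaelqiyao",   0.6141, 0.5787, 0.5964, 0.6171, 0.6016),
    ("Ig920810",       0.6079, 0.5684, 0.5882, 0.5972, 0.5904),
    ("Weiminson",      0.6011, 0.5631, 0.5821, 0.5839, 0.5825),
    ("ZhangPY",        0.5719, 0.5397, 0.5558, 0.5701, 0.5594),
    ("nqt52798669",    0.5414, 0.4998, 0.506,  0.5331, 0.5237),
    ("ShuganYang",     0.4119, 0.3797, 0.3958, 0.3998, 0.3968),
    ("Baseline",       0.5490, 0.5030, 0.5260, 0.5580, 0.5340),
    ("Super Baseline", 0.6782, 0.6356, 0.6569, 0.6703, 0.6603),
]

TOL = 1e-3  # allowance for values rounded to four decimals


def overlap(dsc, jaccard):
    """Assumed definition: mean of DSC and Jaccard."""
    return 0.5 * (dsc + jaccard)


def s_score(dsc, jaccard, f2):
    """Assumed weighting, inferred from the table rather than the caption."""
    return 0.75 * overlap(dsc, jaccard) + 0.25 * f2


def check(rows, tol=TOL):
    """Return teams whose tabulated Overlap or s-score deviates by more than tol."""
    overlap_mismatch, s_mismatch = [], []
    for team, dsc, jac, ov, f2, s in rows:
        if abs(overlap(dsc, jac) - ov) > tol:
            overlap_mismatch.append(team)
        # The s-score is recomputed from DSC and Jaccard, not the tabulated Overlap.
        if abs(s_score(dsc, jac, f2) - s) > tol:
            s_mismatch.append(team)
    return overlap_mismatch, s_mismatch


overlap_mismatch, s_mismatch = check(ROWS)
print("Overlap differs from mean(DSC, Jaccard):", overlap_mismatch)
print("s-score differs from 0.75*Overlap + 0.25*F2:", s_mismatch)
```

Under these assumptions every tabulated s-score is reproduced to within rounding, while the Overlap entries for VegZhang and nqt52798669 are flagged. This may indicate transcription slips in those two cells, although the check cannot confirm what the intended values were.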
