Table 4 Methods and team scores for the semantic segmentation of artefacts.

From: An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy

Columns DSC through s-score are the evaluation metrics.

| Team | Method | Nature | Backbone | DSC | Jaccard | Overlap | F2-score | PPV | Recall | s-score |
|---|---|---|---|---|---|---|---|---|---|---|
| yangsuhui | DeepLabV3+ | Ensemble | ResNet-101 + MobileNetv2 | 0.6810 | 0.6416 | 0.6612 | 0.6779 | 0.8789 | 0.7148 | 0.6654 |
| swtnb | Mask R-CNN + YOLOv3 | Symbiosis | ResNet-101 | 0.6496 | 0.6041 | 0.6269 | 0.6585 | 0.7515 | 0.7594 | 0.6348 |
| YWa | — | — | — | 0.6392 | 0.6021 | 0.6206 | 0.6243 | 0.9039 | 0.6602 | 0.6216 |
| VegZhang | — | — | — | 0.6141 | 0.5831 | 0.6185 | 0.6185 | 0.8386 | 0.6839 | 0.6036 |
| michaelqiyao | PSPNet | Pyramid pooling | ResNet-34 | 0.6141 | 0.5787 | 0.5964 | 0.6171 | 0.8164 | 0.6987 | 0.6016 |
| Ig920810 | — | — | — | 0.6079 | 0.5684 | 0.5882 | 0.5972 | 0.8189 | 0.6802 | 0.5904 |
| Weiminson | — | — | — | 0.6011 | 0.5631 | 0.5821 | 0.5839 | 0.8375 | 0.6598 | 0.5825 |
| ZhangPY | Mask-aided R-CNN | Symbiosis | ResNet-101 | 0.5719 | 0.5397 | 0.5558 | 0.5701 | 0.7719 | 0.6581 | 0.5594 |
| nqt52798669 | Cascaded R-CNN + DLA | Ensemble | ResNet-101 + DLA60 | 0.5414 | 0.4998 | 0.506 | 0.5331 | 0.6290 | 0.6887 | 0.5237 |
| ShuganYang | U-Net-D | Semantic | ResNet-50 | 0.4119 | 0.3797 | 0.3958 | 0.3998 | 0.6407 | 0.6360 | 0.3968 |
| Baseline | U-Net | Semantic | FCN | 0.5490 | 0.5030 | 0.5260 | 0.5580 | 0.6691 | 0.7488 | 0.5340 |
| Super Baseline | Merged | Semantic | — | 0.6782 | 0.6356 | 0.6569 | 0.6703 | 0.8747 | 0.7178 | 0.6603 |

Teams are ordered by decreasing s-score. An off-the-shelf U-Net [24] is reported as a baseline (the row labeled "Baseline") for comparison. We also include the performance of a super segmentation, denoted "Merged", constructed by keeping the consensus predicted segmentations from all teams. The most common architectures were variations of the two-stage Mask R-CNN [15] network (swtnb, ZhangPY). The deep encoder-decoder DeepLabV3+ [19] of yangsuhui obtained the highest s-score, whereas YWa obtained the highest PPV. '—' denotes missing information.
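
As a rough illustration of how the metric columns relate, the sketch below computes DSC, Jaccard, PPV, recall and F2 from a pair of flattened binary masks using their standard definitions. It also combines them into an s-score under an assumption: the weighting (overlap as the mean of DSC and Jaccard, then 0.75 × overlap + 0.25 × F2) is not stated in this table and is only inferred from the tabulated rows. The function names `overlap_metrics` and `s_score` are hypothetical, and the way the challenge averaged scores over images and classes is not modelled here.

```python
from typing import Dict, Sequence


def overlap_metrics(pred: Sequence[int], truth: Sequence[int],
                    beta: float = 2.0) -> Dict[str, float]:
    """Pixel-wise metrics for two flattened binary masks (1 = artefact)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero for empty masks.
        return num / den if den else 0.0

    ppv = ratio(tp, tp + fp)        # positive predictive value (precision)
    recall = ratio(tp, tp + fn)
    b2 = beta * beta
    return {
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "ppv": ppv,
        "recall": recall,
        "f_beta": ratio((1 + b2) * ppv * recall, b2 * ppv + recall),
    }


def s_score(dsc: float, jaccard: float, f2: float) -> float:
    """ASSUMED weighting, inferred from the table rows, not stated in it."""
    overlap = 0.5 * (dsc + jaccard)
    return 0.75 * overlap + 0.25 * f2


if __name__ == "__main__":
    # Toy masks: 2 true positives, 1 false positive, 1 false negative.
    m = overlap_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
    print(m)
    # Recombine the tabulated DSC, Jaccard and F2 values for swtnb.
    print(s_score(0.6496, 0.6041, 0.6585))
```

Feeding the tabulated DSC, Jaccard and F2-score of a row into `s_score` gives a quick consistency check against that row's Overlap and s-score columns.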