Table 1 Deep learning for computer-aided gastrointestinal endoscopy: target disease, method, dataset and outcome summaries of selected comprehensive studies.

From: Where do we stand in AI for endoscopic image analysis? Deciphering gaps and future directions

| Procedure | Organ | Modality | Target disease | Dataset | Method | Outcome | Similar studies |
|---|---|---|---|---|---|---|---|
| OGD | O | WL | BE | Train: 494,364; test: 1704 (669 patients) | Classification^a,1: neoplasia vs NDBE (hybrid ResNet-UNet) | (DS 4) sensitivity: 90%, specificity: 88%, accuracy: 89%; (DS 5) sensitivity: 93%, specificity: 83%, accuracy: 88% | Ebigbo et al.^2 (ResNet100) |
| OGD | O | NBI | SCC | Train: 6473 images; test: 6671 images and 80 videos | Segmentation^39 (SegNet) | (Per-image) sensitivity: 98.04%, specificity: 95.03%; (per-frame) sensitivity: 91.5%, specificity: 99.9% | Nakagawa et al.^116, Sho et al.^117 (SSD); Everson et al.^5 (deep supervision) |
| OGD | S | WLI | AG | 5470 images; train: 70%, test: 30% | Classification^3 (DenseNet121) | Sensitivity: 94.5%, specificity: 94%, accuracy: 94.2% | Guimarães et al.^4 (VGG16) |
| OGD | S | WLI | AG, IM, erosion and haemorrhage | Train: 7326 images; val: 815 images; test: 570 images, 258 external-test images and 80 videos | Classification^a,41 (UNet++, ResNet50) | Accuracy (non-AG/AG, atrophy/IM, erosion/haemorrhage): 88.78%, 87.40%, 93.67% (internal test); 91.23%, 85.81%, 92.70% (external test); 95.00%, 92.86%, 94.74% (video) | Zhao et al.^94 (UNet)^b,c |
| Colon | CR | WL | Polyp | Train: 411 clips; test: 135 clips (videos) | Frame-level polyp/non-polyp classification^42 (binary 3D CNN) | Sensitivity: 90%, specificity: 63%, accuracy: 76%, FP: 60 | Kim et al.^118 (TL: AlexNet) |
| Colon | CR | WL, NBI | Polyp | Train: 8641 images; test: 1330 images and 11 videos | Polyp detection with localisation^43 (YOLO; VGG16 (A1), VGG19 (A2), ResNet50 (A3)) | (A2) sensitivity: 90%, specificity: 95.2%, AUC: 0.991, accuracy: 96%, FP: 7 | Yamada et al.^119 (Faster R-CNN); Klare et al.^95,c |
| Colon | CR | WL, NBI | Polyp | Train: 20,431 images; test: 7077 images (1172 polyps) | Detection^6 for polyp characterisation (SSD) | (WL) sensitivity: 90%, PPV: 83%; (NBI) sensitivity: 97%, PPV: 98% | Lee et al.^120,d; Zachariah et al.^121 |
| Colon | CR | NBI | Polyp | Train: 1100 (aden.) and 1050 (hyp.) images; test: 300 images (180 aden., 120 hyp.) | Classification^9 for polyp characterisation (AutoML) | Sensitivity: 83.3%, specificity: 91.7%, accuracy: 86.7% | Song et al.^8 (CNN); Byrne et al.^7 (CNN) |
| Colon | CR | WL | IBD (UC) | 1651 images; train: 80%, val: 10%, test: 10% and 30 videos | Classification^11 into MCES scoring (159-layer CNN) | Sensitivity: 83%, specificity: 96%, PPV: 86%, NPV: 94% | Ozawa et al.^44 (GoogLeNet); Becker et al.^45,a (CNN) |
| Colon | CR | WL | CRC | Train: 464,105; test: 20,783 (TCH), 15,441 (TFCH) and 48,391 (TGH) | Classification^48: benign vs malignant (169-layer DenseNet, CRCNet) | Sensitivity, specificity per test set: TCH 90.4%, 85.3%; TFCH 78.9%, 95.0%; TGH 74.6%, 99.2% | Ito et al.^122 (AlexNet) |

Abbreviations: OGD oesophago-gastro-duodenoscopy, DNN deep neural network, CNN convolutional neural network, WLI white light imaging, NBI narrow band imaging, PPV positive predictive value, NPV negative predictive value, O oesophagus, S stomach, CR colorectal, IBD inflammatory bowel disease, UC ulcerative colitis, MCES Mayo Clinic Endoscopic Subscore, SSD Single Shot MultiBox Detector, A1–A3 architectures 1 to 3, TCH Tianjin Cancer Hospital, TFCH Tianjin First Central Hospital, TGH Tianjin General Hospital.

^a Multisite study.

^b Comparative study: DL vs endoscopists.

^c Prospective study.

^d Public dataset.
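
For readers less familiar with the diagnostic metrics reported in the Outcome column, the short Python sketch below shows how sensitivity, specificity, PPV, NPV and accuracy follow from a binary confusion matrix. It is purely illustrative: the function name and the example counts are hypothetical and are not taken from any of the studies cited above.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic metrics from binary confusion-matrix counts.

    tp/fp/tn/fn: true/false positives and true/false negatives at a fixed
    decision threshold (per image or per frame, as reported in Table 1).
    """
    return {
        "sensitivity": tp / (tp + fn),            # recall on the diseased class
        "specificity": tn / (tn + fp),            # recall on the healthy class
        "ppv": tp / (tp + fp),                    # positive predictive value (precision)
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

if __name__ == "__main__":
    # Hypothetical counts for a per-image polyp/non-polyp classifier.
    print(binary_metrics(tp=90, fp=12, tn=180, fn=10))
```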