Table 2 Performance metrics for the deep learning models, reported as mean (95% CI) where applicable.

From: Deep learning systems detect dysplasia with human-like accuracy using histopathology and probe-based confocal laser endomicroscopy

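| Modality | Model | Class | Specificity | Sensitivity | PPV | NPV | Accuracy | F1 Score |
|---|---|---|---|---|---|---|---|---|
| pCLE | Attn | Dysplasia | 97% | 57% | 40% | 98% | 96% | 47% |
| pCLE | Attn | Barrett's | 88% | 89% | 94% | 80% | 89% | 92% |
| pCLE | Attn | Squamous | 93% | 90% | 84% | 96% | 92% | 87% |
| pCLE | Attn | Weighted Average | 90% | 88% | 89% | 85% | 90% | 89% |
| pCLE | MultiAttn | Dysplasia | 92% | 71% | 23% | 99% | 91% | 34% |
| pCLE | MultiAttn | Barrett's | 91% | 81% | 95% | 70% | 85% | 88% |
| pCLE | MultiAttn | Squamous | 93% | 92% | 85% | 96% | 93% | 88% |
| pCLE | MultiAttn | Weighted Average | 92% | 84% | 89% | 79% | 87% | 86% |
| Biopsy | Patch-level | Dysplasia | 89% (85–93) | 72% (61–83) | 31% (25–37) | 98% (97–99) | 88% (84–91) | 43% (38–47) |
| Biopsy | Patch-level | Barrett's | 91% (89–93) | 81% (74–88) | 91% (89–92) | 82% (77–88) | 86% (83–89) | 85% (82–89) |
| Biopsy | Patch-level | Squamous | 100% (100–100) | 92% (91–93) | 99% (98–99) | 94% (93–95) | 96% (95–97) | 95% (94–96) |
| Biopsy | Patch-level | Weighted Average | 93% (91–95) | 82% (75–88) | 74% (76–90) | 92% (89–94) | 90% (87–92) | 74% (71–77) |
| Biopsy | Whole-slide-image-level | Dysplasia | 96% (92–100) | 90% (79–100) | 85% (58–100) | 93% (80–100) | 93% (90–97) | 85% (70–100) |
| Biopsy | Whole-slide-image-level | Barrett's | 93% (87–99) | 94% (88–100) | 86% (66–100) | 94% (85–100) | 93% (89–96) | 89% (78–100) |
| Biopsy | Whole-slide-image-level | Squamous | 100% (100–100) | 97% (95–99) | 100% (100–100) | 99% (98–100) | 99% (99–100) | 99% (97–100) |
| Biopsy | Whole-slide-image-level | Weighted Average | 97% (95–99) | 93% (89–96) | 94% (93–96) | 92% (84–100) | 94% (92–97) | 93% (91–95) |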

\(\text{Specificity} = \frac{TN}{FP + TN}\)

\(\text{Sensitivity} = \frac{TP}{TP + FN}\)

\(\text{PPV} = \frac{TP}{TP + FP}\)

\(\text{NPV} = \frac{TN}{TN + FN}\)

\(\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}\)

\(\text{F1} = 2 \times \frac{\text{PPV} \times \text{Sensitivity}}{\text{PPV} + \text{Sensitivity}}\)
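
As a quick consistency check of the F1 definition, the pCLE Attn dysplasia row can be plugged in directly. The inputs are the rounded table values (PPV 40%, sensitivity 57%), so the result is only approximate:

\[
\text{F1} = 2 \times \frac{0.40 \times 0.57}{0.40 + 0.57} = \frac{0.456}{0.97} \approx 0.47
\]

This agrees with the 47% F1 score reported for that row.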

TP = True Positive

FP = False Positive

TN = True Negative

FN = False Negative

PPV = Positive Predictive Value

NPV = Negative Predictive Value
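
The definitions above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function names `safe_div` and `one_vs_rest_metrics` are hypothetical, the counts in the example are invented for illustration only, and it assumes each class is scored one-vs-rest (the class of interest is "positive", all other classes are "negative").

```python
from typing import Dict, Optional


def safe_div(num: float, den: float) -> Optional[float]:
    """Return num / den, or None when the denominator is zero."""
    return num / den if den else None


def one_vs_rest_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Compute the six table metrics from confusion-matrix counts."""
    ppv = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)
    # F1 is undefined if either input is undefined or both are zero.
    if ppv is None or sensitivity is None or ppv + sensitivity == 0:
        f1 = None
    else:
        f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "specificity": safe_div(tn, fp + tn),
        "sensitivity": sensitivity,
        "ppv": ppv,
        "npv": safe_div(tn, tn + fn),
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "f1": f1,
    }


# Hypothetical counts, chosen only for illustration (not from the study).
example = one_vs_rest_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Returning `None` for a zero denominator is one possible design choice; it keeps an undefined metric (for example, PPV when a class is never predicted) distinct from a genuine score of zero.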