Table 4. Diagnostic Performance of Reviewers and DL-DSS for GB polyps larger than 10 mm.
| | AUC | Comparison | Sensitivity (%) | Specificity (%) | Accuracy (%) | F-1 Score |
|---|---|---|---|---|---|---|
Step 1: Initial Image Analysis | | Step 1 vs Step 2 (p value) | | | | |
Reviewer A | 0.92 [0.82–0.97] | 0.96 | 90.9 (30/33) [75.7–98.1] | 77.4 (24/31) [58.9–90.4] | 89.1 (57/64) [67.6–94.4] | 0.857 [0.769–0.895] |
Reviewer B | 0.68 [0.55–0.79] | <0.001 | 69.7 (23/33) [51.3–84.4] | 54.8 (17/31) [36.0–72.7] | 62.5 (40/64) [43.9–78.7] | 0.657 [0.530–0.744] |
Reviewer C | 0.82 [0.70–0.90] | 0.04 | 100 (33/33) [89.4–100.0] | 41.9 (13/31) [24.5–60.9] | 71.9 (46/64) [58.0–81.1] | 0.786 [0.733–0.814] |
Step 2: DL-DSS Alone | ||||||
DL-DSS | 0.92 [0.82–0.97] | | 78.8 (26/33) [61.1–91.0] | 87.1 (27/31) [70.2–96.4] | 82.8 (53/64) [65.5–93.6] | 0.825 [0.705–0.896] |
Step 3: DL-DSS Aided | | Step 1 vs Step 3 (p value) | | | | |
Reviewer A | 0.94 [0.84–0.98] | 0.16 | 87.9 (29/33) [71.8–96.6] | 93.6 (29/31) [78.6–99.2] | 90.7 (58/64) [75.1–97.9] | 0.906 [0.807–0.953] |
Reviewer B | 0.89 [0.79–0.96] | <0.001 | 84.9 (28/33) [68.1–94.9] | 87.1 (27/31) [70.2–96.4] | 86.0 (55/64) [69.1–95.6] | 0.862 [0.756–0.917] |
Reviewer C | 0.89 [0.79–0.96] | 0.08 | 97.0 (32/33) [84.2–99.9] | 54.8 (17/31) [36.0–72.7] | 76.6 (49/64) [60.9–86.7] | 0.810 [0.743–0.825] |
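
The point estimates in each row can be recomputed from the counts shown in parentheses. The sketch below does this for the DL-DSS alone row. It assumes that the 33-case group is the positive class and that the F-1 score follows the standard definition 2TP / (2TP + FP + FN); the table itself does not state how the score was computed. The function name `diagnostic_metrics` is hypothetical and used for illustration only.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy (in percent) and the F1 score."""
    sensitivity = 100.0 * tp / (tp + fn)                # correct positives / all positives
    specificity = 100.0 * tn / (tn + fp)                # correct negatives / all negatives
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # all correct / all cases
    f1 = 2 * tp / (2 * tp + fp + fn)                    # standard F1 (assumed definition)
    return sensitivity, specificity, accuracy, f1


# DL-DSS alone (Step 2): 26 of 33 positives and 27 of 31 negatives classified correctly,
# so FN = 33 - 26 = 7 and FP = 31 - 27 = 4.
sens, spec, acc, f1 = diagnostic_metrics(tp=26, fn=7, tn=27, fp=4)
print(f"{sens:.1f} {spec:.1f} {acc:.1f} {f1:.3f}")  # prints: 78.8 87.1 82.8 0.825
```

The same call with the counts from any other row gives that row's point estimates. The bracketed intervals are not reproduced here, because the table does not say which method produced them.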