Table 2 Performance of the deep learning models for frame-level diagnosis in the internal and external validation.
Model | AUC (95% CI) | Sensitivity (%) (95% CI) | Specificity (%) (95% CI) | Accuracy (%) (95% CI) |
|---|---|---|---|---|
Internal validation | | | | |
ViT-based model | 0.860 (0.855–0.866) | 77.7 (76.4–79.0) | 77.6 (77.4–77.8) | 77.6 (77.4–77.8) |
Standard CNN-based model | 0.799 (0.792–0.805) | 71.7 (70.2–74.0) | 73.8 (73.6–74.0) | 73.8 (73.6–73.9) |
External validation | | | | |
ViT-based model | 0.845 (0.837–0.853) | 76.5 (74.6–78.4) | 76.0 (75.7–76.3) | 76.0 (75.7–76.3) |
Standard CNN-based model | 0.791 (0.782–0.800) | 71.4 (69.3–73.4) | 71.9 (71.6–72.3) | 71.9 (71.6–72.2) |
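
For readers who want to reproduce the sensitivity, specificity, and accuracy columns of a table like this one, the following is a minimal sketch of how they could be derived from frame-level confusion counts. It rests on assumptions: the table does not state how its 95% CIs were obtained, so the Wilson score interval is used purely as an illustration; the function names `wilson_ci` and `frame_level_metrics` and the counts are hypothetical and not taken from the study. AUC is not covered, since it requires per-frame scores rather than counts.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the source table does not name its CI method; Wilson is
    used here only as a common choice for proportions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def frame_level_metrics(tp, fn, tn, fp):
    """Return {metric: (estimate, lower, upper)} as proportions in [0, 1]."""
    pairs = {
        "sensitivity": (tp, tp + fn),            # positives correctly flagged
        "specificity": (tn, tn + fp),            # negatives correctly cleared
        "accuracy": (tp + tn, tp + fn + tn + fp),
    }
    return {name: (k / n, *wilson_ci(k, n)) for name, (k, n) in pairs.items()}


# Hypothetical counts for illustration only; not data from the study.
metrics = frame_level_metrics(tp=80, fn=20, tn=700, fp=200)
for name, (est, lo, hi) in metrics.items():
    print(f"{name}: {100 * est:.1f} ({100 * lo:.1f}-{100 * hi:.1f})")
```

Because accuracy pools positive and negative frames, it is pulled toward specificity whenever negative frames greatly outnumber positive ones, and its interval narrows with the larger denominator.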