Table 2 Quantitative evaluation results on internal and external datasets.
Dataset | Videos (n) | ROC AUC (%) | Average precision (%) | F1-score (%) | Precision (%) | Recall (%) |
---|---|---|---|---|---|---|
Test | 12 | 99.97 | 99.94 | 99.50 | 99.69 | 99.31 |
Gastric bypass<sup>a</sup> | 10 | 99.94 ± 0.07 | 99.42 ± 0.26 | 96.10 ± 1.53 | 98.39 ± 0.81 | 93.96 ± 3.67 |
Center 1 | 5 | 99.99 | 99.60 | 97.18 | 97.82 | 96.55 |
Center 2 | 5 | 99.89 | 99.23 | 95.01 | 98.96 | 91.36 |
Cholecystectomy<sup>a</sup> | 20 | 99.71 ± 0.40 | 98.66 ± 1.27 | 94.74 ± 2.17 | 95.43 ± 5.62 | 94.37 ± 4.01 |
Center 3 | 5 | 99.83 | 99.00 | 92.78 | 87.20 | 99.12 |
Center 4 | 5 | 99.92 | 98.93 | 96.27 | 97.79 | 94.80 |
Center 5 | 5 | 99.12 | 96.85 | 92.98 | 96.95 | 89.33 |
Center 6 | 5 | 99.97 | 99.85 | 96.93 | 99.79 | 94.22 |
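
The aggregated rows (Gastric bypass, Cholecystectomy) appear to summarize the per-center rows listed beneath them. The sketch below checks this for the Cholecystectomy row. It assumes each aggregate is the unweighted mean ± sample standard deviation (n − 1 denominator) across centers. That assumption is inferred from the numbers, not stated in the table, and the names `cholecystectomy_centers` and `summarize` are illustrative only.

```python
from statistics import mean, stdev

# Per-center values copied from the Cholecystectomy rows of Table 2 (Centers 3-6).
cholecystectomy_centers = {
    "ROC AUC": [99.83, 99.92, 99.12, 99.97],
    "F1-score": [92.78, 96.27, 92.98, 96.93],
    "Precision": [87.20, 97.79, 96.95, 99.79],
    "Recall": [99.12, 94.80, 89.33, 94.22],
}


def summarize(values):
    """Format per-center values as 'mean ± sample standard deviation'."""
    # Assumption: unweighted mean over centers and n-1 (sample) standard deviation.
    return f"{mean(values):.2f} ± {stdev(values):.2f}"


for metric, values in cholecystectomy_centers.items():
    print(f"{metric}: {summarize(values)}")
# ROC AUC: 99.71 ± 0.40
# F1-score: 94.74 ± 2.17
# Precision: 95.43 ± 5.62
# Recall: 94.37 ± 4.01
```

Under this assumption the recomputed values match the Cholecystectomy row. Because the per-center inputs are already rounded to two decimals, a recomputed value for other metrics or datasets may differ from the table in the last digit.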