Table 2. Keypoint labeling performance among models.
From: An open-source tool for automated human-level circling behavior detection
| Dataset | Full | Half | Quarter | Eighth |
|---|---|---|---|---|
| # Training videos | 188 | 94 | 47 | 24 |
| # Testing videos | N/A | 94 | 94 | 94 |
| # Networks | 1 | 10 | 10 | 10 |
| Training RMSE, pixels (mean (95% CI)) | 7.82 | 9.29 (8.13–10.73) | 9.84 (8.53–11.7) | 11.02 (9.11–12.91) |
| Testing RMSE, pixels (mean (95% CI)) | N/A | 19.37 (16.92–22.28) | 12.3 (10.51–14.4) | 14.34 (12.66–15.98) |
| F1 score (mean (95% CI)) | 0.43 (0.21–0.57) | 0.39 (0.17–0.54) | 0.41 (0.19–0.56) | 0.36 (0.14–0.52) |
| P-value vs. full, human | N/A, 0.51 | *0.03, ***1.7E−4 | *0.03, ***1.4E−4 | *0.02, ***3.9E−5 |