Table 2. Keypoint labeling performance among models.

From: An open-source tool for automated human-level circling behavior detection

| Dataset | Full | Half | Quarter | Eighth |
| --- | --- | --- | --- | --- |
| # Videos, train | 188 | 94 | 47 | 24 |
| # Videos, test | N/A | 94 | 94 | 94 |
| # Networks | 1 | 10 | 10 | 10 |
| Training RMSE, pixels (mean (95% CI)) | 7.82 | 9.29 (8.13–10.73) | 9.84 (8.53–11.7) | 11.02 (9.11–12.91) |
| Testing RMSE, pixels (mean (95% CI)) | N/A | 19.37 (16.92–22.28) | 12.3 (10.51–14.4) | 14.34 (12.66–15.98) |
| F1 score (mean (95% CI)) | 0.43 (0.21–0.57) | 0.39 (0.17–0.54) | 0.41 (0.19–0.56) | 0.36 (0.14–0.52) |
| P-value (vs. full, vs. human) | N/A, 0.51 | *0.03, ***1.7E−4 | *0.03, ***1.4E−4 | *0.02, ***3.9E−5 |

  1. Subsets of our manually labeled frames were used to train different neural network models using DeepLabCut (DLC). All models were initialized from the pretrained ResNet50 model available through DLC and trained for up to 100,000 iterations at a learning rate of 0.001. Performance was assessed using root-mean-squared error (RMSE), in pixels, between model-assigned and manually labeled snout and tailbase positions. *p < 0.05, ***p < 0.001.
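
The RMSE metric described in the note can be sketched as follows. This is a minimal illustration, assuming RMSE means the square root of the mean squared Euclidean distance between paired predicted and labeled keypoints. It is not necessarily the exact calculation DeepLabCut performs internally. The function name `keypoint_rmse` and the coordinates are hypothetical and are not data from the study.

```python
import math


def keypoint_rmse(predicted, labeled):
    """Root-mean-squared Euclidean error, in pixels, over paired (x, y) keypoints.

    Assumption: each squared error is the squared Euclidean distance between a
    model-assigned position and its manually labeled counterpart.
    """
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must have the same length")
    if not predicted:
        raise ValueError("at least one keypoint pair is required")
    squared_errors = [
        (px - lx) ** 2 + (py - ly) ** 2
        for (px, py), (lx, ly) in zip(predicted, labeled)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical snout and tailbase positions for a single frame (illustration only).
labeled = [(100.0, 50.0), (140.0, 80.0)]
predicted = [(103.0, 54.0), (140.0, 80.0)]
rmse = keypoint_rmse(predicted, labeled)
```

In this toy frame the snout prediction is 5 pixels from its label and the tailbase prediction is exact, so the result is the square root of 12.5 (about 3.54 pixels). Larger errors weigh more heavily in RMSE than in a plain mean distance.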