Supplementary Figure 5: Estimation accuracy improves with few samples.
From: Fast animal pose estimation using deep neural networks

a,b, Error distance distributions per body part when estimated with networks trained for 15 epochs on 10 (a) or 250 (b) labeled frames. c, Time spent labeling each frame decreases as the quality of initialization improves. Line and shaded regions correspond to mean and s.d., respectively. Starting frames require 115.4 ± 45.0 s (mean ± s.d.) to label, decreasing to 6.1 ± 7.7 s after initialization with a network trained on 1,000 labeled frames (n = 1,500 total labeled frames). d, Accuracy improves with very few labeled samples. The error plateaus at around 150–200 frames, with only marginal improvements from additional labeling. Circles denote the test set r.m.s. error for one replicate of fast training (15 epochs) at each dataset size; lines denote the mean of all replicates.
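
For readers who want to reproduce the quantities named in this legend, the following is a minimal sketch of how per-body-part error distances (panels a,b) and a test set r.m.s. error averaged over replicates (panel d) could be computed. The legend does not state how errors are pooled across body parts or frames, so the pooling over all body parts and frames used here is an assumption, and the function names (`error_distances`, `rms_error`, `mean_rms_per_size`) are hypothetical rather than taken from the authors' code.

```python
import math
from statistics import mean


def error_distances(pred, truth):
    """Euclidean error per frame and body part.

    pred, truth: list of frames; each frame is a list of (x, y) positions,
    one per body part, in the same order. Returns a list of lists of distances.
    """
    return [
        [math.hypot(px - tx, py - ty)
         for (px, py), (tx, ty) in zip(p_frame, t_frame)]
        for p_frame, t_frame in zip(pred, truth)
    ]


def rms_error(pred, truth):
    """Root-mean-square of the error distances.

    Assumption: distances are pooled over all frames and body parts.
    """
    flat = [d for frame in error_distances(pred, truth) for d in frame]
    return math.sqrt(mean(d * d for d in flat))


def mean_rms_per_size(rms_by_size):
    """Average r.m.s. error across replicates for each dataset size.

    rms_by_size: dict mapping number of labeled frames -> list of r.m.s.
    errors, one per training replicate.
    """
    return {n: mean(vals) for n, vals in sorted(rms_by_size.items())}


if __name__ == "__main__":
    # Toy data: two frames, two body parts, every prediction off by (3, 4).
    truth = [[(0.0, 0.0), (10.0, 10.0)], [(5.0, 5.0), (1.0, 2.0)]]
    pred = [[(x + 3.0, y + 4.0) for x, y in frame] for frame in truth]
    print(error_distances(pred, truth))
    print(rms_error(pred, truth))
```

In this sketch, each circle in panel d would correspond to one `rms_error` value for a single replicate, and the line to the output of `mean_rms_per_size`.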