Extended Data Fig. 6: Comparison of behaviour counts from SIPEC:BehaveNet, a pose-estimation-based approach and human raters.

Unsupported rears, supported rears and grooming events were counted per video for n = 20 different mouse videos. Behaviours were integrated over multiple frames, as described in Sturman et al. Behavioural counts from 3 different human expert annotators were averaged (shown in the legend as ‘human ground truth’). No significant differences in the number of behaviours were found between SIPEC:BehaveNet and human annotators, or between the approach of Sturman et al. and human annotators (Tukey’s multiple comparison test). Data are shown as means, with all individual points displayed.
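The counting and averaging steps can be illustrated with a minimal Python sketch. It assumes an event is a run of consecutive frames carrying the same label, which is only one possible reading of "integrated over multiple frames" and may differ from the procedure in Sturman et al. The function names, the `min_frames` parameter and all values are hypothetical and made up for illustration; they are not data from the study, and the statistical test is not covered.

```python
from itertools import groupby
from statistics import mean


def count_events(frame_labels, behaviour, min_frames=1):
    """Count runs of consecutive frames labelled `behaviour`.

    A run only counts as an event if it spans at least `min_frames` frames
    (hypothetical threshold, not taken from the paper).
    """
    return sum(
        1
        for label, run in groupby(frame_labels)
        if label == behaviour and len(list(run)) >= min_frames
    )


def average_rater_counts(counts_per_rater):
    """Average per-video counts across raters.

    `counts_per_rater` holds one list per rater, each with one count per video.
    """
    return [mean(video_counts) for video_counts in zip(*counts_per_rater)]


# Made-up per-frame labels for a single video.
frames = ["none", "groom", "groom", "none", "rear", "rear", "rear", "none", "groom"]
grooming_events = count_events(frames, "groom", min_frames=2)

# Made-up counts from three raters across four videos.
rater_counts = [[3, 5, 2, 4], [4, 5, 1, 4], [2, 5, 3, 4]]
human_ground_truth = average_rater_counts(rater_counts)
```

Here the single-frame grooming run at the end of `frames` is dropped by `min_frames=2`, and `human_ground_truth` holds one averaged count per video, which is the quantity the model counts would be compared against.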