Fig. 3
From: Comparison of marker-less 2D image-based methods for infant pose estimation

Difference to annotation \(d_a\) in pixels for different subjects, evaluated on our dataset and grouped by key point. Error bars represent 95% confidence intervals of the mean. (a) Mean \(d_a\) of the different generic frameworks, with human results shown for comparison. (b) Mean \(d_a\) of the infant pose estimators. Generic ViTPose is also shown for comparison with non-retrained results. Note that AggPose does not estimate the positions of the nose and ears.
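
As a rough illustration of how per-key-point means and error bars of this kind can be computed, the following minimal sketch may help. It rests on assumptions that the caption does not confirm: \(d_a\) is taken to be the Euclidean pixel distance between a predicted and an annotated key point, and the 95% confidence interval uses a normal approximation (1.96 times the standard error of the mean). The function names and sample values are hypothetical and are not data from the paper.

```python
import math
from statistics import mean, stdev


def difference_to_annotation(pred, truth):
    """Assumed definition of d_a: Euclidean pixel distance between two (x, y) points."""
    return math.hypot(pred[0] - truth[0], pred[1] - truth[1])


def mean_with_ci95(values):
    """Return (mean, half-width of a normal-approximation 95% CI of the mean)."""
    m = mean(values)
    if len(values) < 2:
        # A single sample gives no spread estimate, so no interval is reported.
        return m, float("nan")
    sem = stdev(values) / math.sqrt(len(values))
    return m, 1.96 * sem


def summarize_by_keypoint(samples):
    """samples: iterable of (keypoint_name, predicted_xy, annotated_xy) tuples."""
    grouped = {}
    for name, pred, truth in samples:
        grouped.setdefault(name, []).append(difference_to_annotation(pred, truth))
    return {name: mean_with_ci95(dists) for name, dists in grouped.items()}


# Made-up illustrative values, not measurements from the study.
samples = [
    ("nose", (103.0, 54.0), (100.0, 50.0)),
    ("nose", (100.0, 50.0), (100.0, 50.0)),
    ("left_wrist", (16.0, 8.0), (10.0, 0.0)),
]
summary = summarize_by_keypoint(samples)
for name, (m, half_width) in summary.items():
    print(f"{name}: mean d_a = {m:.2f} px, 95% CI half-width = {half_width:.2f} px")
```

An estimator that does not predict certain key points (as noted for AggPose with the nose and ears) would simply contribute no samples for those groups in a summary of this form.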