Fig. 1 | Scientific Reports

From: Reducing annotation burden in physical activity research using vision language models

Illustration of the computer vision approaches compared (top). Below, quartile plots\(^{12}\) show the five-number summary of per-participant \(F_1\)-scores for sedentary behaviour (SB), light-intensity physical activity (LIPA), and moderate-to-vigorous physical activity (MVPA), for the best-performing vision-language model, LLaVA (squares), and the best-performing discriminative vision model, ViT (circles), each selected via hyperparameter tuning. Performance is shown for participants in the Oxfordshire study (blue) and the Sichuan study (red) who were withheld from model selection. MVPA constitutes only 8% of the training set, which is reflected in the high variance of its per-participant \(F_1\)-scores.
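The five-number summary underlying a quartile plot (minimum, lower quartile, median, upper quartile, maximum) can be sketched as follows; the scores here are illustrative placeholders, not values from the study:

```python
import numpy as np

# Hypothetical per-participant F1-scores for one activity class
f1_scores = np.array([0.42, 0.55, 0.61, 0.68, 0.70,
                      0.74, 0.79, 0.83, 0.88, 0.91])

# Five-number summary: min, Q1, median, Q3, max
summary = np.percentile(f1_scores, [0, 25, 50, 75, 100])
print(dict(zip(["min", "q1", "median", "q3", "max"], summary)))
```

Each quartile plot in the figure condenses the distribution of per-participant scores for one class and model into these five values.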