Fig. 4
From: EgoVision a YOLO-ViT hybrid for robust egocentric object recognition

Key-frame selection capturing the pre-interaction, during-interaction, and post-interaction phases of input videos.