Fig. 3: The AE framework with the self-supervised model.
From: Deep learning for video-based assessment of endotracheal intubation skills

The autoencoder (AE) takes a sequence of frames as input, computes a low-dimensional feature representation, and outputs reconstructed frames. We use the low-dimensional features for classification.
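
The data flow described in this caption can be sketched in a few lines of Python. This is only an illustrative, untrained toy under stated assumptions: the class name `TinySequenceAE`, the linear encoder and decoder, the `tanh` nonlinearity, the logistic classifier head, and all sizes (8 frames, 16 pixels per frame, 4 latent features) are hypothetical and are not taken from the article, whose actual architecture and training procedure are not given here.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a rows x cols matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinySequenceAE:
    """Hypothetical stand-in for the AE: linear encoder/decoder over a flattened clip."""

    def __init__(self, num_frames, frame_size, latent_dim, seed=0):
        rng = random.Random(seed)
        self.num_frames = num_frames
        self.frame_size = frame_size
        input_dim = num_frames * frame_size
        self.enc = make_matrix(latent_dim, input_dim, rng)  # encoder weights
        self.dec = make_matrix(input_dim, latent_dim, rng)  # decoder weights
        self.clf = [rng.uniform(-0.1, 0.1) for _ in range(latent_dim)]  # classifier weights

    def encode(self, frames):
        """Sequence of frames -> low-dimensional feature vector."""
        flat = [pixel for frame in frames for pixel in frame]
        return [math.tanh(v) for v in matvec(self.enc, flat)]

    def decode(self, features):
        """Low-dimensional features -> reconstructed sequence of frames."""
        flat = matvec(self.dec, features)
        n = self.frame_size
        return [flat[i * n:(i + 1) * n] for i in range(self.num_frames)]

    def classify(self, features):
        """Logistic score computed from the low-dimensional features only."""
        score = sum(w * f for w, f in zip(self.clf, features))
        return 1.0 / (1.0 + math.exp(-score))


# Assumed toy input: a clip of 8 frames, each flattened to 16 pixel values.
data_rng = random.Random(1)
clip = [[data_rng.random() for _ in range(16)] for _ in range(8)]

model = TinySequenceAE(num_frames=8, frame_size=16, latent_dim=4)
features = model.encode(clip)            # low-dimensional representation
reconstruction = model.decode(features)  # reconstructed frames
probability = model.classify(features)   # classification uses the features, not the frames
```

In a real setting the encoder and decoder would be trained to minimise a reconstruction error between `clip` and `reconstruction`, and the classifier would then be fitted on the resulting features; the sketch above only shows which tensor feeds which stage.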