Fig. 6: Clinical Scenario of SleepXViT.
From: Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals

Using the confidence scores generated by the model, images are first filtered by the Intra-epoch ViT, which identifies low-confidence cases for further review. These selected samples are then examined by the Inter-epoch ViT for additional filtering. Finally, images that still require human assessment are flagged, allowing clinicians to make the final determination with the aid of heatmaps and relevance scores. In this confidence-based workflow, clinicians retain the final decision while completing the staging task efficiently.
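
The two-stage filtering described in this caption could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the use of the maximum class probability as the confidence score, and the 0.9 thresholds are all assumptions made for the example, and the probability vectors are toy values.

```python
from typing import List, Sequence


def confidence(probs: Sequence[float]) -> float:
    """Confidence of one prediction, taken here as the largest class
    probability (an assumption; the caption does not define the score)."""
    return max(probs)


def triage(
    intra_probs: Sequence[Sequence[float]],
    inter_probs: Sequence[Sequence[float]],
    intra_threshold: float = 0.9,  # hypothetical cutoff
    inter_threshold: float = 0.9,  # hypothetical cutoff
) -> List[int]:
    """Return indices of epochs that should be flagged for clinician review."""
    # Stage 1: keep epochs where the Intra-epoch ViT is not confident.
    low_conf = [
        i for i, p in enumerate(intra_probs) if confidence(p) < intra_threshold
    ]
    # Stage 2: of those, keep epochs where the Inter-epoch ViT is still not confident.
    return [i for i in low_conf if confidence(inter_probs[i]) < inter_threshold]


# Toy per-epoch class probabilities over five sleep stages (made-up values).
intra = [
    [0.95, 0.02, 0.01, 0.01, 0.01],  # confident at stage 1, never reviewed
    [0.60, 0.30, 0.05, 0.03, 0.02],  # low at stage 1, resolved at stage 2
    [0.45, 0.40, 0.05, 0.05, 0.05],  # low at both stages, flagged
]
inter = [
    [0.97, 0.01, 0.01, 0.005, 0.005],
    [0.92, 0.05, 0.01, 0.01, 0.01],
    [0.50, 0.45, 0.02, 0.02, 0.01],
]

flagged = triage(intra, inter)
print(flagged)
```

In this toy run only the third epoch (index 2) remains flagged, so a clinician would review that single image together with its heatmap and relevance scores rather than all three.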