Fig. 4

Examples of attention visualizations generated by the ViT model. Each column (a)-(f) corresponds to one case and shows, in order, the raw input image, the attention heatmap overlay, and the transparency visualization. Columns (a)-(c) show anemia cases, and columns (d)-(f) show cases without anemia. Highlighted regions correspond to clinically meaningful cues such as conjunctival pallor, scleral hue, and vascular patterns.