Figure 4 | Scientific Reports

Figure 4

From: Multimodal masked siamese network improves chest X-ray representation learning

Visualization of predictions using attention heat maps. Examples of attention maps generated from the last layer of the ViT-S backbone, shown alongside the corresponding prediction scores. Predictions are compared across five labels: Edema, No finding, Pleural effusion, Pneumonia, and Support devices. (a) The original CXR image. (b) Attention maps generated by the vanilla MSN (best-performing baseline). (c) Attention maps generated by MSN \(+\, x_{sex}\) (best-performing proposed variant).
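
For readers unfamiliar with how such maps are typically produced, the following is a minimal sketch of one common approach, not necessarily the procedure used for this figure. It takes the last-layer self-attention of a ViT, averages the attention from the class token to each patch over heads, and upsamples the result to image resolution. The function name `cls_attention_heatmap`, the 14 x 14 patch grid, the 224-pixel image size, the six heads, and the presence of a class token at index 0 are all assumptions made for illustration.

```python
import numpy as np


def cls_attention_heatmap(attn, grid_size=14, image_size=224):
    """Turn last-layer ViT attention into an image-sized heat map.

    attn: array of shape (heads, tokens, tokens) holding softmax attention
          weights, where token 0 is assumed to be the class token and the
          remaining tokens are the grid_size * grid_size image patches.
    """
    heads, tokens, _ = attn.shape
    if tokens != 1 + grid_size * grid_size:
        raise ValueError("token count does not match the assumed patch grid")

    # Attention paid by the class token to every patch, averaged over heads.
    cls_to_patches = attn[:, 0, 1:].mean(axis=0)
    heat = cls_to_patches.reshape(grid_size, grid_size)

    # Min-max normalise to [0, 1] so maps are comparable when displayed.
    span = heat.max() - heat.min()
    heat = (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)

    # Nearest-neighbour upsampling: each patch value fills one image block.
    scale = image_size // grid_size
    return np.kron(heat, np.ones((scale, scale)))


# Usage with synthetic attention weights (hypothetical, for shape checking only).
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 197, 197))
weights = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
heatmap = cls_attention_heatmap(weights)
```

The resulting array can be overlaid on the original image with any plotting library. Averaging over heads is one design choice among several; per-head maps or attention rollout are alternatives.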
