Fig. 7: Visualization of activation maps.

Grad-CAM was used to interpret the mammogram patterns learned by each model. Compared with the ResBlock model, the Transformer localized tumors more accurately for tumor size prediction and assigned higher importance to tumor and peritumoral regions when predicting lymph node metastasis (LNM) and lymphovascular invasion (LVI). Two patients, A and B, both LNM positive and LVI positive, were selected; their mediolateral views with visible lesions are shown to demonstrate the activation maps of the Transformer and the ResBlock model (columns 1–2 for patient A and columns 3–4 for patient B). Model predictions are denoted as p. The expected values of the LNM and LVI predictions were 0.239 and 0.158, respectively, for the Transformer model, and 0.240 and 0.178 for the ResBlock model.
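
For readers unfamiliar with how such activation maps are produced, the core Grad-CAM computation can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the feature maps of a chosen layer and the gradients of the target score with respect to them have already been extracted, and the function name `grad_cam` and the `(K, H, W)` array layout are hypothetical choices made for this example.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map from one layer's feature maps.

    activations: array of shape (K, H, W), the K feature maps of the layer.
    gradients:   array of shape (K, H, W), d(score)/d(activations).
    Returns an (H, W) map scaled to [0, 1] (all zeros if nothing is positive).
    Both inputs are assumed to be extracted beforehand (illustrative setup).
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    if activations.ndim != 3 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (K, H, W)")

    # Channel weights: spatial average of the gradients for each feature map.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)

    # ReLU keeps only regions that increase the target score.
    cam = np.maximum(cam, 0.0)

    # Normalize to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam
```

In practice the resulting map is upsampled to the mammogram resolution and overlaid on the image; that step depends on the imaging pipeline and is omitted here.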