Fig. 7

Architectural Details of the Dynamic Attention Block: The DA block adapts MobileNetV2 features via a 1 × 1 convolution, applies multi-head self-attention with residual connections and normalization for stable learning, and refines the outputs with a lightweight MLP before classification. Unlike ViTs, it avoids tokenization and positional encoding, enabling efficient global context modelling.
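A minimal PyTorch sketch of how such a block could be wired, following the caption's description (1 × 1 conv adapter, multi-head self-attention with residuals and normalization, lightweight MLP, classifier). The channel count, embedding dimension, head count, MLP ratio, and class count below are illustrative assumptions, not values given in the figure.

```python
import torch
import torch.nn as nn

class DynamicAttentionBlock(nn.Module):
    """Sketch of a DA block: 1x1 conv adapter -> MHSA -> MLP, with residuals and LayerNorm."""
    def __init__(self, in_channels=1280, dim=256, num_heads=4, mlp_ratio=2, num_classes=2):
        super().__init__()
        # 1x1 convolution adapts MobileNetV2 feature channels to the attention dimension
        self.adapter = nn.Conv2d(in_channels, dim, kernel_size=1)
        self.norm1 = nn.LayerNorm(dim)
        # Multi-head self-attention over spatial positions (no patch tokenization or positional encoding)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        # Lightweight MLP refining the attended features
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                      # x: [B, C, H, W] MobileNetV2 feature map
        x = self.adapter(x)                    # [B, dim, H, W]
        x = x.flatten(2).transpose(1, 2)       # [B, H*W, dim]: spatial positions form the sequence
        y = self.norm1(x)
        a, _ = self.attn(y, y, y)              # self-attention
        x = x + a                              # residual connection
        x = x + self.mlp(self.norm2(x))        # MLP refinement with residual connection
        x = x.mean(dim=1)                      # global average pooling over spatial positions
        return self.head(x)                    # class logits

# Example usage with a MobileNetV2-sized feature map (e.g., 7x7x1280 for 224x224 inputs)
feats = torch.randn(2, 1280, 7, 7)
logits = DynamicAttentionBlock()(feats)
print(logits.shape)  # torch.Size([2, 2])
```

Because the spatial locations of the backbone's feature map are used directly as the attention sequence, no explicit tokenization or positional embedding step is needed, which is the efficiency point the caption makes.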