Fig. 5 | Scientific Reports

From: Visual feature-based multi-scale hybrid attention network for fine-grained Hawthorn varieties identification

The spatial local attention module. (A) Our proposed SLA. The input feature map X is processed by the MaxPool and AvgPool layers to obtain feature vectors of size \(1\times 1\times C^{\prime}\). Meanwhile, X is compressed by a \(1\times 1\) convolution and normalized by a Softmax layer, yielding self-attention weights over the spatial dimension. These weights highlight the low-frequency regions of the image and reduce the loss of local features caused by max pooling. (B) The Spatial Attention Module (SPA) of the Convolutional Block Attention Module (CBAM).
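The data flow described in the caption — parallel max/avg pooling branches alongside a 1×1-convolution-plus-Softmax branch that produces spatial attention weights — can be sketched as follows. This is a minimal NumPy illustration of the general idea only; the layer shapes, the fusion step, and the function name `sla_sketch` are assumptions, not the authors' implementation.

```python
import numpy as np

def sla_sketch(x, w_proj):
    """Hypothetical sketch of the SLA data flow from the caption.
    x: feature map of shape (C, H, W).
    w_proj: (C,) weights standing in for a 1x1 conv that compresses
    the C channels into a single spatial map (an assumption)."""
    c, h, w = x.shape
    # MaxPool / AvgPool branches: global pooling to 1x1xC descriptors
    max_vec = x.reshape(c, -1).max(axis=1)            # (C,)
    avg_vec = x.reshape(c, -1).mean(axis=1)           # (C,)
    # 1x1 conv branch: project channels to one spatial map, then Softmax
    s = np.tensordot(w_proj, x, axes=(0, 0))          # (H, W)
    s = s - s.max()                                    # numerical stability
    attn = np.exp(s) / np.exp(s).sum()                 # spatial attention weights, sums to 1
    # Attention-weighted spatial aggregation keeps local-feature information
    # that plain max pooling would discard
    attended = (x * attn).reshape(c, -1).sum(axis=1)   # (C,)
    # Fuse the pooled and attended descriptors (simple sum as a placeholder)
    return max_vec + avg_vec + attended

# Usage with a toy 8-channel, 4x4 feature map
x = np.random.rand(8, 4, 4)
w = np.random.rand(8)
out = sla_sketch(x, w)  # channel descriptor of shape (8,)
```

The Softmax over all H×W positions is what turns the compressed map into a spatial weighting, so regions with low activation still contribute to the aggregated descriptor rather than being dropped as they would be under pure max pooling.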
