Table 5 Pseudocode of the multi-scale channel attention module (MS-CAM).
| Algorithm 2 Multiscale Channel Attention Module (MS-CAM) | |
|---|---|
| Input: feature map \(X = [x_{1}, x_{2}, \ldots, x_{C}] \in \mathbb{R}^{C \times H \times W}\) | |
| Parameters: \(\mathcal{B}\) denotes the batch normalization (BN) layer; \(\delta\) denotes the rectified linear unit (ReLU); \(\mathcal{V}\) denotes the view (reshape) function; \(\sigma\) denotes the sigmoid function; \(\oplus\) denotes element-wise addition; \(\otimes\) denotes element-wise multiplication | |
| 1 | // Local Attention |
| 2 | \(Y_{1} = \mathcal{B}(\mathrm{PWConv}_{1}(X))\) |
| 3 | \(Y_{2} = \delta(Y_{1})\) |
| 4 | \(\mathrm{L}(X) = \mathcal{B}(\mathrm{PWConv}_{2}(Y_{2}))\) |
| 5 | // Efficient Channel Attention |
| 6 | \(Y_{1} = \mathcal{V}(\mathrm{AvgPool}(X))\) |
| 7 | \(Y_{2} = \mathcal{V}(\sigma(\mathrm{1DConv}(Y_{1})))\) |
| 8 | \(\mathrm{E}(X) = Y_{2} \otimes X\) |
| 9 | // MS-CAM |
| 10 | \(\mathrm{M}(X) = X \otimes \sigma(\mathrm{L}(X) \oplus \mathrm{E}(X))\) |
| Output: the refined feature map \(\mathrm{M}(X) \in \mathbb{R}^{C \times H \times W}\) | |
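A minimal PyTorch sketch of Algorithm 2 follows. The class name `MSCAM`, the channel-reduction ratio `r` of the point-wise convolutions, and the kernel size `k` of the 1D convolution are assumptions made for illustration; the table does not specify these hyperparameters.

```python
import torch
import torch.nn as nn


class MSCAM(nn.Module):
    """Sketch of Algorithm 2 (Table 5); r and k are assumed, not from the table."""

    def __init__(self, channels: int, r: int = 4, k: int = 3):
        super().__init__()
        inter = max(channels // r, 1)

        # Local attention: PWConv1 -> BN -> ReLU -> PWConv2 -> BN (lines 2-4)
        self.local_att = nn.Sequential(
            nn.Conv2d(channels, inter, kernel_size=1, bias=False),   # PWConv1
            nn.BatchNorm2d(inter),                                   # B
            nn.ReLU(inplace=True),                                   # delta
            nn.Conv2d(inter, channels, kernel_size=1, bias=False),   # PWConv2
            nn.BatchNorm2d(channels),                                # B
        )

        # Efficient channel attention: AvgPool -> 1DConv -> sigmoid (lines 6-8)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv1d = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape

        # L(X): local channel context, same shape as X
        l = self.local_att(x)

        # E(X): global channel context via ECA-style 1D convolution
        y = self.avg_pool(x).view(b, 1, c)      # V(AvgPool(X)) -> (B, 1, C)
        y = self.sigmoid(self.conv1d(y))        # sigma(1DConv(Y1))
        y = y.view(b, c, 1, 1)                  # V(...) -> (B, C, 1, 1)
        e = y * x                               # E(X) = Y2 (x) X

        # M(X) = X (x) sigma(L(X) (+) E(X))  (line 10)
        return x * self.sigmoid(l + e)


# Usage with hypothetical shapes:
# m = MSCAM(channels=64)
# out = m(torch.randn(2, 64, 32, 32))   # out.shape == (2, 64, 32, 32)
```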