Table 6 Structural parameters of DLoGNet

From: Towards accurate bird sound recognition through multi-scale texture-aware modeling

| Block | Layer (type) | Output shape |
| --- | --- | --- |
| BDCM-1 | DLoG-1 | [−1, 4, 128, 128] |
| | Conv-1 (3×3) | [−1, 64, 128, 128] |
| | BatchNorm2d-1 + ReLU-1 + MaxPool2d-1 | [−1, 64, 64, 64] |
| BDCM-2 | DLoG-2 | [−1, 256, 64, 64] |
| | Conv-2 (3×3) | [−1, 128, 64, 64] |
| | BatchNorm2d-2 + ReLU-2 + MaxPool2d-2 | [−1, 128, 32, 32] |
| BDCM-3 | DLoG-3 | [−1, 512, 32, 32] |
| | Conv-3 (3×3) | [−1, 128, 32, 32] |
| | BatchNorm2d-3 + ReLU-3 + MaxPool2d-3 | [−1, 128, 16, 16] |
| BDCM-4 | DLoG-4 | [−1, 512, 16, 16] |
| | Conv-4 (3×3) | [−1, 128, 16, 16] |
| | BatchNorm2d-4 + ReLU-4 + MaxPool2d-4 | [−1, 128, 8, 8] |
| BDCM-5 | DLoG-5 | [−1, 512, 8, 8] |
| | Conv-5 (3×3) | [−1, 64, 8, 8] |
| | BatchNorm2d-5 + ReLU-5 + MaxPool2d-5 | [−1, 64, 4, 4] |
| | Fully connected layer-1 | [−1, 1024] |
| | Fully connected layer-2 | [−1, 8] |
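
The output shapes listed in Table 6 can be cross-checked with simple shape arithmetic. The sketch below is not the authors' implementation and relies on assumptions the table does not state: a single-channel 128×128 input, a DLoG layer that multiplies the channel count by 4 while keeping the spatial size (inferred from the rows 4, 64→256 and 128→512), 3×3 convolutions with stride 1 and padding 1, and 2×2 max pooling with stride 2. The names `trace_shapes`, `CONV_OUT_CHANNELS`, and `DLOG_EXPANSION` are hypothetical and introduced only for illustration.

```python
# Sketch: propagate tensor shapes through the five BDCM blocks of Table 6.
# Assumptions (not stated in the table): single-channel 128x128 input,
# DLoG multiplies channels by 4 and keeps the spatial size,
# 3x3 convolutions use stride 1 and padding 1, max pooling is 2x2 with stride 2.

CONV_OUT_CHANNELS = [64, 128, 128, 128, 64]  # Conv-1 .. Conv-5, read from Table 6
DLOG_EXPANSION = 4  # inferred from the table: 1->4, 64->256, 128->512


def trace_shapes(in_channels=1, size=128):
    """Return a list of (layer name, (channels, height, width)) tuples."""
    shapes = []
    channels = in_channels
    for i, conv_out in enumerate(CONV_OUT_CHANNELS, start=1):
        # DLoG: channel expansion, spatial size unchanged (assumed).
        channels *= DLOG_EXPANSION
        shapes.append((f"DLoG-{i}", (channels, size, size)))
        # 3x3 convolution with padding 1: only the channel count changes.
        channels = conv_out
        shapes.append((f"Conv-{i}", (channels, size, size)))
        # BatchNorm and ReLU keep the shape; 2x2 max pooling halves height and width.
        size //= 2
        shapes.append((f"MaxPool2d-{i}", (channels, size, size)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)

# Number of features after flattening the last pooled map.
last_channels, last_h, last_w = shapes[-1][1]
flattened = last_channels * last_h * last_w
print("flattened features:", flattened)
```

Under these assumptions every traced shape agrees with the corresponding row of the table, and the last pooled map flattens to 64 × 4 × 4 = 1024 features, the same width the table lists for the output of Fully connected layer-1.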