Table 4 Model configuration.
| Layer | ResNet-18 | | VGG-16 | AlexNet | Baseline |
|---|---|---|---|---|---|
| 1 | Conv: 5\(\times\)5, 64k, 2p, 1s | | Conv\(\times\)2: 3\(\times\)3, 64c, 1p, 1s | Conv: 1\(\times\)1, 48k, 1p, 1s | Conv: 1\(\times\)1, 64k, 1p, 1s |
| | Max-pooling: 3\(\times\)3, 1p, 2s | | | Max-pooling: 3\(\times\)3, 1p, 2s | Max-pooling: 3\(\times\)3, 2s |
| 2 | Conv\(\times\)2: 3\(\times\)3, 64k, 1p, 1s | | | Conv: 5\(\times\)5, 128k, 1p, 1s | Conv: 5\(\times\)5, 128k, 2p, 1s |
| | | | Max-pooling: 2\(\times\)2, 1p, 2s | Max-pooling: 3\(\times\)3, 1p, 2s | |
| 3 | | rConv: 64k, 1s | Conv\(\times\)2: 3\(\times\)3, 128c, 1p, 1s | Conv\(\times\)2: 3\(\times\)3, 192k, 1p, 1s | Conv: 1\(\times\)1, 1k, 1p, 1s |
| | | | | | Ave-pooling: 3\(\times\)3, 1p, 2s |
| 4 | Conv\(\times\)2: 3\(\times\)3, 64k, 1p, 1s | | | | Bi-LSTM: 256 |
| | | | Max-pooling: 2\(\times\)2, 1p, 2s | | |
| 5 | | rConv: 64k, 1s | Conv\(\times\)3: 3\(\times\)3, 256c, 1p, 1s | Conv: 3\(\times\)3, 128k, 1p, 1s | FC-1: 256 |
| | | | | Max-pooling: 3\(\times\)3, 2s | |
| 6 | Conv: 3\(\times\)3, 128k, 1p, 2s | | | FC-1: 2048 | FC-2: 32 |
| 7 | Conv: 3\(\times\)3, 128k, 1p, 1s | rConv: 128k, 2s | | FC-2: 2048 | |
| | | | Max-pooling: 2\(\times\)2, 1p, 2s | | |
| 8 | Conv\(\times\)2: 3\(\times\)3, 128k, 1p, 1s | | Conv\(\times\)3: 3\(\times\)3, 512c, 1p, 1s | FC-3: 32 | |
| 9 | | rConv: 128k, 1s | | | |
| 10 | Conv: 3\(\times\)3, 256k, 1p, 2s | | | | |
| | | | Max-pooling: 2\(\times\)2, 1p, 2s | | |
| 11 | Conv: 3\(\times\)3, 256k, 1p, 1s | rConv: 256k, 2s | Conv\(\times\)3: 3\(\times\)3, 512c, 1p, 1s | | |
| 12 | Conv\(\times\)2: 3\(\times\)3, 256k, 1p, 1s | | | | |
| 13 | | rConv: 256k, 1s | | | |
| | | | Max-pooling: 2\(\times\)2, 1p, 2s | | |
| 14 | Conv: 3\(\times\)3, 512k, 1p, 2s | | FC-1: 4096 | | |
| 15 | Conv: 3\(\times\)3, 512k, 1p, 1s | rConv: 512k, 2s | FC-2: 4096 | | |
| 16 | Conv\(\times\)2: 3\(\times\)3, 512k, 1p, 1s | | FC-3: 1000 | | |
| 17 | | rConv: 512k, 1s | FC-4: 32 | | |
| | Ave-pooling: 1\(\times\)1, 1s | | | | |
| 18 | FC-1: 1000 | | | | |
| 19 | FC-2: 32 | | | | |
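
The table does not define its notation or the input size. As a rough aid for reading the entries, the sketch below assumes that `p` denotes zero-padding and `s` denotes stride, and applies the standard output-size formula for convolution and pooling layers to the first two ResNet-18 entries. The helper name `out_size` and the 64-pixel input are hypothetical and chosen only for illustration.

```python
def out_size(n, kernel, padding=0, stride=1):
    """Spatial output size of a conv/pooling layer, using floor rounding."""
    return (n + 2 * padding - kernel) // stride + 1


# Hypothetical input width; the table does not state the real input size.
n = 64

# "Conv: 5x5, 64k, 2p, 1s" (assumed: kernel 5, padding 2, stride 1)
after_conv = out_size(n, kernel=5, padding=2, stride=1)

# "Max-pooling: 3x3, 1p, 2s" (assumed: kernel 3, padding 1, stride 2)
after_pool = out_size(after_conv, kernel=3, padding=1, stride=2)

print(after_conv, after_pool)
```

Under these assumptions the 5\(\times\)5 convolution with padding 2 preserves the spatial size, and each stride-2 pooling step roughly halves it.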