Table 3 Test results (%) obtained with ResNet50, Swin Transformer, and EfficientNetV2-M as backbones at data volumes of 1/2, 1/4, 1/8, 1/12, and 1/16.
Data volume | Backbone | Top-1 predicted verb: Verb | Top-1 predicted verb: Value | Top-1 predicted verb: Value-all | Top-1 predicted verb: Grnd value | Top-1 predicted verb: Grnd value-all | Top-5 predicted verbs: Verb | Top-5 predicted verbs: Value | Top-5 predicted verbs: Value-all | Top-5 predicted verbs: Grnd value | Top-5 predicted verbs: Grnd value-all | Ground-truth verb: Value | Ground-truth verb: Value-all | Ground-truth verb: Grnd value | Ground-truth verb: Grnd value-all |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1/2 | ResNet50 | 37.71 | 29.52 | 17.43 | 23.55 | 9.42 | 66.15 | 50.48 | 28.07 | 39.97 | 14.78 | 72.74 | 36.92 | 56.87 | 19.04 |
1/2 | Swin Transformer | 41.88 | 33.44 | 20.44 | 27.09 | 11.4 | 70.31 | 54.96 | 32.18 | 44.28 | 17.73 | 75.09 | 40.6 | 59.72 | 21.85 |
1/2 | EfficientNetV2-M | 42.38 | 33.19 | 19.42 | 23.51 | 7.71 | 70.82 | 54.22 | 30.23 | 37.96 | 11.73 | 73.49 | 37.79 | 50.94 | 14.44 |
1/4 | ResNet50 | 30.49 | 23.07 | 12.72 | 17.79 | 6.26 | 57.75 | 42.63 | 22.24 | 32.59 | 10.67 | 69.19 | 31.74 | 52.07 | 14.79 |
1/4 | Swin Transformer | 36.07 | 27.75 | 15.92 | 21.98 | 8.41 | 63.09 | 47.4 | 25.61 | 37.06 | 13.1 | 71.32 | 34.75 | 55.02 | 17.39 |
1/4 | EfficientNetV2-M | 37.65 | 29.03 | 16.58 | 20.45 | 6.6 | 64.92 | 49.13 | 26.81 | 34.13 | 10.18 | 71.98 | 35.63 | 49.43 | 13.36 |
1/8 | ResNet50 | 23.77 | 17.43 | 9.11 | 12.92 | 4.04 | 48.08 | 34.03 | 16.49 | 24.92 | 7.08 | 65.22 | 26.77 | 47.23 | 11.39 |
1/8 | Swin Transformer | 29.28 | 21.99 | 11.72 | 16.31 | 5.18 | 54.66 | 39.54 | 19.8 | 29.07 | 8.52 | 67.48 | 29.58 | 49.07 | 12.59 |
1/8 | EfficientNetV2-M | 30.51 | 22.93 | 12.27 | 15.8 | 4.85 | 56.14 | 40.83 | 20.6 | 27.73 | 7.8 | 68.11 | 30.11 | 47.86 | 11.56 |
1/12 | ResNet50 | 19.8 | 14.23 | 7.33 | 9.59 | 2.62 | 42.06 | 28.88 | 13.42 | 19.17 | 4.66 | 62.55 | 23.98 | 41.1 | 8.6 |
1/12 | Swin Transformer | 25.23 | 18.44 | 9.54 | 12.35 | 3.38 | 48.57 | 34.25 | 16.41 | 22.66 | 5.69 | 64.94 | 26.72 | 42.78 | 9.65 |
1/12 | EfficientNetV2-M | 26.54 | 19.69 | 10.4 | 13.23 | 3.75 | 50.19 | 35.88 | 17.58 | 24 | 6.25 | 65.8 | 27.38 | 43.61 | 9.82 |
1/16 | ResNet50 | 16.85 | 12.01 | 6.13 | 8.02 | 2.16 | 37.64 | 25.44 | 11.45 | 16.75 | 3.93 | 60.82 | 21.87 | 39.49 | 7.82 |
1/16 | Swin Transformer | 22.57 | 16.51 | 8.65 | 10.87 | 2.79 | 45 | 31.31 | 14.69 | 20.52 | 4.88 | 63.33 | 24.31 | 41.4 | 8.57 |
1/16 | EfficientNetV2-M | 23.5 | 17.15 | 8.97 | 11.4 | 3.13 | 45.75 | 32.1 | 15.29 | 21.12 | 5.37 | 63.91 | 25.14 | 41.69 | 9.02 |