Table 3. Comparison of the baselines and our method with different network backbones on the FLIR dataset.

From: Multimodal fusion transformer network for multispectral pedestrian detection in low-light condition

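| Detector | Backbone | Method | Params (M) ↓ | mAP50 ↑ | mAP75 ↑ | mAP ↑ |
|----------|--------------|----------|--------------|---------|---------|-------|
| YOLOv11 | ResNet50 | Baseline | 48.73 | 69.5 | 29.5 | 34.8 |
| YOLOv11 | ResNet50 | Ours | 103.32 | 72.1 | 31.2 | 35.6 |
| YOLOv11 | VGG16 | Baseline | 40.53 | 68.5 | 27.9 | 32.8 |
| YOLOv11 | VGG16 | Ours | 83.17 | 70.1 | 29.4 | 34.4 |
| YOLOv11 | CSPDarkNet53 | Baseline | 3.95 | 72.1 | 31.3 | 36.8 |
| YOLOv11 | CSPDarkNet53 | Ours | 12.41 | 75.4 | 34.1 | 39.4 |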
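
The trade-off between parameter count and accuracy in Table 3 is easier to read as per-backbone differences. The following is a minimal sketch that only does arithmetic on the values listed above; the names `ROWS`, `compare`, and `SUMMARY` are illustrative and do not come from the paper.

```python
# Values copied from Table 3 (FLIR dataset, YOLOv11 detector).
# Tuple layout: (Params in millions, mAP50, mAP75, mAP).
ROWS = {
    "ResNet50": {
        "Baseline": (48.73, 69.5, 29.5, 34.8),
        "Ours": (103.32, 72.1, 31.2, 35.6),
    },
    "VGG16": {
        "Baseline": (40.53, 68.5, 27.9, 32.8),
        "Ours": (83.17, 70.1, 29.4, 34.4),
    },
    "CSPDarkNet53": {
        "Baseline": (3.95, 72.1, 31.3, 36.8),
        "Ours": (12.41, 75.4, 34.1, 39.4),
    },
}


def compare(rows):
    """Return, per backbone, the parameter ratio and metric gains (Ours minus Baseline)."""
    summary = {}
    for backbone, methods in rows.items():
        base_params, base_50, base_75, base_map = methods["Baseline"]
        ours_params, ours_50, ours_75, ours_map = methods["Ours"]
        summary[backbone] = {
            "param_ratio": round(ours_params / base_params, 2),
            "d_map50": round(ours_50 - base_50, 1),
            "d_map75": round(ours_75 - base_75, 1),
            "d_map": round(ours_map - base_map, 1),
        }
    return summary


SUMMARY = compare(ROWS)

for backbone, s in SUMMARY.items():
    print(
        f"{backbone}: params x{s['param_ratio']}, "
        f"mAP50 +{s['d_map50']}, mAP75 +{s['d_map75']}, mAP +{s['d_map']}"
    )
```

On these numbers, the proposed method raises all three metrics for every backbone while roughly doubling to tripling the parameter count. CSPDarkNet53 shows the largest gains and remains by far the smallest model.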