Table 3 Comparison of the baselines and our method with different network backbones on the FLIR dataset.
From: Multimodal fusion transformer network for multispectral pedestrian detection in low-light condition
Detector | Backbone | Methods | Params(M)↓ | mAP50↑ | mAP75↑ | mAP↑ |
---|---|---|---|---|---|---|
YOLOv11 | ResNet50 | Baseline | 48.73 | 69.5 | 29.5 | 34.8 |
YOLOv11 | ResNet50 | Ours | 103.32 | 72.1 | 31.2 | 35.6 |
YOLOv11 | VGG16 | Baseline | 40.53 | 68.5 | 27.9 | 32.8 |
YOLOv11 | VGG16 | Ours | 83.17 | 70.1 | 29.4 | 34.4 |
YOLOv11 | CSPDarkNet53 | Baseline | 3.95 | 72.1 | 31.3 | 36.8 |
YOLOv11 | CSPDarkNet53 | Ours | 12.41 | 75.4 | 34.1 | 39.4 |
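
The trade-off in the table (more parameters in exchange for higher mAP) can be read off per backbone by subtracting each baseline row from the corresponding "Ours" row. The sketch below does only that arithmetic on the values transcribed from the table; the names `TABLE3` and `gains` are illustrative assumptions and do not come from the paper.

```python
# Values transcribed from Table 3 (detector: YOLOv11, dataset: FLIR).
# Key: (backbone, method) -> (params in millions, mAP50, mAP75, mAP).
# TABLE3 and gains are hypothetical names used only for this illustration.
TABLE3 = {
    ("ResNet50", "Baseline"): (48.73, 69.5, 29.5, 34.8),
    ("ResNet50", "Ours"): (103.32, 72.1, 31.2, 35.6),
    ("VGG16", "Baseline"): (40.53, 68.5, 27.9, 32.8),
    ("VGG16", "Ours"): (83.17, 70.1, 29.4, 34.4),
    ("CSPDarkNet53", "Baseline"): (3.95, 72.1, 31.3, 36.8),
    ("CSPDarkNet53", "Ours"): (12.41, 75.4, 34.1, 39.4),
}


def gains(table):
    """Return {backbone: (d_params, d_mAP50, d_mAP75, d_mAP)} as Ours minus Baseline."""
    result = {}
    for backbone, method in table:
        if method != "Baseline":
            continue
        base = table[(backbone, "Baseline")]
        ours = table[(backbone, "Ours")]
        # Round to two decimals to hide floating-point noise in the differences.
        result[backbone] = tuple(round(o - b, 2) for o, b in zip(ours, base))
    return result


for backbone, (d_params, d_map50, d_map75, d_map) in gains(TABLE3).items():
    print(f"{backbone}: +{d_params} M params, mAP50 +{d_map50}, "
          f"mAP75 +{d_map75}, mAP +{d_map}")
```

On these numbers, CSPDarkNet53 shows the largest gains on all three mAP metrics and the smallest absolute parameter increase of the three backbones.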