Table 1 Comparison with advanced techniques on the LLVIP dataset.
From: Multimodal fusion transformer network for multispectral pedestrian detection in low-light condition
Method | Data | Backbone | mAP50 (%) ↑ | mAP (%) ↑ |
---|---|---|---|---|
Halfwayfusion [41] | RGB + IR | VGG16 | 91.4 | 55.1 |
GAFF [42] | RGB + IR | ResNet18 | 94.0 | 55.8 |
ProbEn [43] | RGB + IR | ResNet50 | 93.4 | 51.5 |
CSAA [44] | RGB + IR | ResNet50 | 94.3 | 59.2 |
RSDet [40] | RGB + IR | ResNet50 | 95.8 | 61.3 |
FusionGAN [45] | RGB + IR | GAN | 83.8 | 48.1 |
GANMcC [46] | RGB + IR | GAN | 87.8 | 49.8 |
NestFuse [47] | RGB + IR | Encoder–decoder | 86.9 | 49.7 |
DenseFuse [48] | RGB + IR | Encoder–decoder | 88.2 | 50.4 |
SDNet [27] | RGB + IR | – | 86.6 | 50.8 |
U2Fusion [49] | RGB + IR | VGG | 87.1 | 47.6 |
DIVFusion [50] | RGB + IR | Encoder–decoder | 89.8 | 52.0 |
Ours | RGB + IR | CSPDarknet53 | 96.4 | 62.7 |