Table 3 Comparison of zero-shot detection performance between advanced multimodal models and our proposed method on the MMW-M dataset.
From: Open vocabulary detection for concealed object detection in AMMW image
| Method | P | R | mAP@0.5 | mAP@[0.5-0.95] |
|---|---|---|---|---|
| CLIP | 38.2 | 34.6 | 37.9 | 26.2 |
| Grounding DINO | 35.6 | 32.5 | 32.7 | 22.9 |
| Swin-T | 29.5 | 23.8 | 27.2 | 16.3 |
| YOLO-World | 44.8 | 41.2 | 47.4 | 33.9 |
| Ours | 58.9 | 54.3 | 57.8 | 43.4 |