Table 3. Comparison of the zero-shot detection capabilities of advanced multimodal models and our proposed method on the MMW-M dataset.

From: Open vocabulary detection for concealed object detection in AMMW image

| Method | P | R | mAP@0.5 | mAP@[0.5-0.95] |
|---|---|---|---|---|
| CLIP | 38.2 | 34.6 | 37.9 | 26.2 |
| Grounding DINO | 35.6 | 32.5 | 32.7 | 22.9 |
| Swin-T | 29.5 | 23.8 | 27.2 | 16.3 |
| YOLO-World | 44.8 | 41.2 | 47.4 | 33.9 |
| Ours | 58.9 | 54.3 | 57.8 | 43.4 |

  1. Significant values are in bold.