Table 1 Comparative results for cross-modal retrieval on NUS-WIDE and MIRFLICKR-25K, showing the proposed method against baseline models at R@5 and R@10.
From: Mutual contextual relation-guided dynamic graph networks for cross-modal image-text retrieval
| Task | Model | NUS-WIDE R@5 | NUS-WIDE R@10 | MIRFLICKR-25K R@5 | MIRFLICKR-25K R@10 |
|---|---|---|---|---|---|
| Image-to-Text | CCA | 75.3 | 71.4 | 81.6 | 77.7 |
| | VSE++ | 76.7 | 75.3 | 82.8 | 79.5 |
| | PiTL | 79.2 | 76.7 | 83.6 | 80.9 |
| | UNITER | 81.1 | 78.4 | 85.4 | 81.6 |
| | ALBEF | 82.4 | 79.8 | 87.3 | 82.6 |
| | Static_CMFG | 85.2 | 81.4 | 88.8 | 84.2 |
| | GAT-H | 86.4 | 83.2 | 89.5 | 85.7 |
| | Dynamic_CMFG | 88.3 | 85.7 | 90.1 | 86.5 |
| Text-to-Image | CCA | 66.3 | 61.4 | 73.2 | 71.5 |
| | VSE++ | 69.7 | 65.5 | 75.3 | 72.1 |
| | PiTL | 71.5 | 69.6 | 78.5 | 74.3 |
| | UNITER | 75.4 | 72.9 | 79.2 | 75.6 |
| | ALBEF | 78.2 | 76.4 | 80.1 | 77.1 |
| | Static_CMFG | 79.5 | 76.9 | 81.3 | 78.5 |
| | GAT-H | 79.8 | 77.01 | 81.6 | 79.1 |
| | Dynamic_CMFG | 80.6 | 77.8 | 82.4 | 79.6 |