Table 1 Comparative results for cross-modal retrieval on NUS-WIDE and MIRFLICKR-25K, comparing the proposed method with baseline models at R@5 and R@10.

From: Mutual contextual relation-guided dynamic graph networks for cross-modal image-text retrieval

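| Task | Model | NUS-WIDE R@5 | NUS-WIDE R@10 | MIRFLICKR-25K R@5 | MIRFLICKR-25K R@10 |
|---|---|---|---|---|---|
| Image-to-Text | CCA | 75.3 | 71.4 | 81.6 | 77.7 |
| Image-to-Text | VSE++ | 76.7 | 75.3 | 82.8 | 79.5 |
| Image-to-Text | PiTL | 79.2 | 76.7 | 83.6 | 80.9 |
| Image-to-Text | UNITER | 81.1 | 78.4 | 85.4 | 81.6 |
| Image-to-Text | ALBEF | 82.4 | 79.8 | 87.3 | 82.6 |
| Image-to-Text | Static_CMFG | 85.2 | 81.4 | 88.8 | 84.2 |
| Image-to-Text | GAT-H | 86.4 | 83.2 | 89.5 | 85.7 |
| Image-to-Text | Dynamic_CMFG | 88.3 | 85.7 | 90.1 | 86.5 |
| Text-to-Image | CCA | 66.3 | 61.4 | 73.2 | 71.5 |
| Text-to-Image | VSE++ | 69.7 | 65.5 | 75.3 | 72.1 |
| Text-to-Image | PiTL | 71.5 | 69.6 | 78.5 | 74.3 |
| Text-to-Image | UNITER | 75.4 | 72.9 | 79.2 | 75.6 |
| Text-to-Image | ALBEF | 78.2 | 76.4 | 80.1 | 77.1 |
| Text-to-Image | Static_CMFG | 79.5 | 76.9 | 81.3 | 78.5 |
| Text-to-Image | GAT-H | 79.8 | 77.01 | 81.6 | 79.1 |
| Text-to-Image | Dynamic_CMFG | 80.6 | 77.8 | 82.4 | 79.6 |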