Table 2 MAP results of different methods (TVGraz).

From: Cross-modal semantic autoencoder with embedding consensus

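| Method | Image-text (\(\hbox{R}=40\)) | Text-image (\(\hbox{R}=40\)) | Average (\(\hbox{R}=40\)) | Image-text (\(\hbox{R}=\hbox{all}\)) | Text-image (\(\hbox{R}=\hbox{all}\)) | Average (\(\hbox{R}=\hbox{all}\)) |
|--------|-------|-------|-------|-------|-------|-------|
| CCA    | 0.629 | 0.624 | 0.627 | 0.612 | 0.603 | 0.619 |
| BLM    | 0.637 | 0.625 | 0.634 | 0.623 | 0.618 | 0.626 |
| LCFS   | 0.647 | 0.647 | 0.651 | 0.637 | 0.625 | 0.634 |
| LGCFL  | 0.658 | 0.641 | 0.653 | 0.649 | 0.636 | 0.641 |
| JFSSL  | 0.654 | 0.645 | 0.656 | 0.654 | 0.649 | 0.657 |
| CSAEC  | 0.672 | 0.653 | 0.671 | 0.663 | 0.659 | 0.674 |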