Table 4 Semantic answer similarity validation.

From: Multi-modal transformer architecture for medical image analysis and automated report generation

| Model | Skip thought CS | Vector extrema | Greedy matching |
| --- | --- | --- | --- |
| BEiTGPT2 | 0.9829 | 0.9871 | 0.9878 |
| DEiTGPT2 | 0.9761 | 0.98 | 0.9813 |
| ViTGPT2 | 0.9811 | 0.9836 | 0.9882 |
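
The table reports the scores but not how they were computed. As a rough illustration only, the sketch below follows the commonly used definitions of two of the metrics: vector extrema (per dimension, keep the token value with the largest magnitude, then compare the two sentence vectors by cosine similarity) and greedy matching (match each token to its most similar token in the other sentence, average, and symmetrize). The function names and the two-dimensional token embeddings are hypothetical placeholders, not the authors' implementation or data. The skip-thought score is omitted because it needs a pretrained sentence encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extrema_vector(token_vectors):
    """Per dimension, keep the token value with the largest absolute magnitude."""
    return [max(column, key=abs) for column in zip(*token_vectors)]


def vector_extrema_score(candidate, reference):
    """Cosine similarity between the extrema vectors of two sentences."""
    return cosine(extrema_vector(candidate), extrema_vector(reference))


def greedy_one_way(source, target):
    """Average, over source tokens, of the best cosine match among target tokens."""
    return sum(max(cosine(s, t) for t in target) for s in source) / len(source)


def greedy_matching_score(candidate, reference):
    """Symmetric greedy matching: mean of the two one-way scores."""
    return 0.5 * (greedy_one_way(candidate, reference)
                  + greedy_one_way(reference, candidate))


# Hypothetical 2-d token embeddings for a generated and a reference sentence.
# Real use would substitute pretrained word embeddings for each token.
generated = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [1.0, 1.0]]

print("vector extrema :", vector_extrema_score(generated, reference))
print("greedy matching:", greedy_matching_score(generated, reference))
```

Both scores are cosine-based, so for embeddings with mostly non-negative similarity they fall near the top of the 0 to 1 range when the generated text is close to the reference, which is the regime the table's values sit in.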