Table 4. Ablation study of the MARNN-FRFICP methodology on the Flickr8k dataset.

From: An innovative multi-head attention mechanism-driven recurrent neural network model with feature representation fusion for enhanced image captioning to assist individuals with visual impairments

Flickr8K Dataset

| Technique | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | CIDEr |
| --- | --- | --- | --- | --- | --- | --- |
| GF | 76.33 | 59.43 | 52.31 | 41.38 | 39.13 | 59.82 |
| InceptionResNetV2 | 76.84 | 60.20 | 53.16 | 42.11 | 40.00 | 60.37 |
| CvT | 77.45 | 60.90 | 53.72 | 42.78 | 40.61 | 61.10 |
| DenseNet169 | 78.28 | 61.52 | 54.53 | 43.54 | 41.14 | 61.99 |
| LOA | 78.82 | 62.07 | 55.39 | 44.16 | 41.94 | 62.63 |
| MH-BLG | 79.47 | 62.73 | 56.13 | 45.00 | 42.74 | 63.25 |
| MARNN-FRFICP | 80.10 | 63.55 | 56.64 | 45.78 | 43.54 | 63.97 |
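
One way to read an ablation table like this is to compute how much the full MARNN-FRFICP model gains over each listed technique on every metric. The sketch below does that with the scores copied from Table 4. The names `METRICS`, `SCORES`, and `gains_over` are hypothetical helpers introduced for illustration and are not part of the original work.

```python
# Scores copied from Table 4 (Flickr8k); list order follows the table columns.
METRICS = ["BLEU_1", "BLEU_2", "BLEU_3", "BLEU_4", "METEOR", "CIDEr"]
SCORES = {
    "GF": [76.33, 59.43, 52.31, 41.38, 39.13, 59.82],
    "InceptionResNetV2": [76.84, 60.20, 53.16, 42.11, 40.00, 60.37],
    "CvT": [77.45, 60.90, 53.72, 42.78, 40.61, 61.10],
    "DenseNet169": [78.28, 61.52, 54.53, 43.54, 41.14, 61.99],
    "LOA": [78.82, 62.07, 55.39, 44.16, 41.94, 62.63],
    "MH-BLG": [79.47, 62.73, 56.13, 45.00, 42.74, 63.25],
    "MARNN-FRFICP": [80.10, 63.55, 56.64, 45.78, 43.54, 63.97],
}


def gains_over(baseline, full="MARNN-FRFICP"):
    """Return per-metric differences (full minus baseline), rounded to 2 decimals."""
    return {
        metric: round(full_score - base_score, 2)
        for metric, full_score, base_score in zip(
            METRICS, SCORES[full], SCORES[baseline]
        )
    }


# Print the gain of the full model over every other row of the table.
for name in SCORES:
    if name != "MARNN-FRFICP":
        print(name, gains_over(name))
```

These differences are plain subtractions of the reported scores. They say nothing about statistical significance, which the table does not report.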