Table 10 Results of the ablation study of the MARNN-FRFICP model on the MSCOCO dataset.

From: An innovative multi-head attention mechanism-driven recurrent neural network model with feature representation fusion for enhanced image captioning to assist individuals with visual impairments

MSCOCO Dataset

| Technique | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | CIDEr |
|---|---|---|---|---|---|---|
| GF | 79.01 | 67.24 | 51.47 | 43.53 | 37.47 | 146.13 |
| InceptionResNetV2 | 79.76 | 68.02 | 52.06 | 44.34 | 38.22 | 146.96 |
| CvT | 80.56 | 68.56 | 52.66 | 45.13 | 38.97 | 147.57 |
| DenseNet169 | 81.42 | 69.15 | 53.22 | 45.84 | 39.60 | 148.31 |
| LOA | 82.15 | 69.73 | 54.12 | 46.36 | 40.26 | 149.20 |
| MH-BLG | 82.74 | 70.28 | 54.84 | 47.13 | 41.01 | 149.80 |
| MARNN-FRFICP | 83.36 | 70.85 | 55.59 | 47.86 | 41.69 | 150.62 |