Table 7. Updated evaluation results, with COMET scores added to assess the cross-lingual consistency and semantic fidelity of the sentence recommendations.
| Metric | English dataset | Arabic dataset | Overall |
|---|---|---|---|
| BLEU-4 | 78.2 | 74.1 | 76.15 |
| METEOR | 68.3 | 64.6 | 66.45 |
| ROUGE-L | 75.8 | 70.4 | 73.1 |
| CIDEr | 136.7 | 133.2 | 134.95 |
| SPICE | 52.0 | 47.5 | 49.75 |
| COMET | 67.0 | 65.0 | 66.0 |
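
The table does not state how the Overall column is derived. The reported values are consistent with an unweighted mean of the English and Arabic scores, and the minimal sketch below checks that assumption against the numbers in Table 7. The names `table7` and `unweighted_mean` are illustrative only and do not come from the original evaluation code.

```python
# Scores copied from Table 7: metric -> (English, Arabic, reported Overall).
table7 = {
    "BLEU-4": (78.2, 74.1, 76.15),
    "METEOR": (68.3, 64.6, 66.45),
    "ROUGE-L": (75.8, 70.4, 73.1),
    "CIDEr": (136.7, 133.2, 134.95),
    "SPICE": (52.0, 47.5, 49.75),
    "COMET": (67.0, 65.0, 66.0),
}


def unweighted_mean(english: float, arabic: float) -> float:
    """Assumed aggregation: each dataset counts equally, regardless of size."""
    return (english + arabic) / 2


for metric, (english, arabic, reported) in table7.items():
    computed = unweighted_mean(english, arabic)
    # A small tolerance absorbs floating-point rounding in the comparison.
    matches = abs(computed - reported) < 1e-6
    print(f"{metric:8s} computed={computed:.2f} reported={reported:.2f} match={matches}")
```

If the two datasets differ in size, a mean weighted by the number of sentences would generally give different Overall values, so the column is best read as a per-dataset average and not as a pooled score.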