Table 4 Ablation experiment.

From: A hallucination detection and mitigation framework for faithful text summarization using LLMs

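| Dataset | Method | ROUGE-1 | ROUGE-2 | ROUGE-L | FACTCC | BertScore | BartScore |
|---|---|---|---|---|---|---|---|
| CNN/Daily Mail | ChatGPT | 34.45 | 13.98 | 32.84 | 35.43 | 88.80 | -1.80 |
| CNN/Daily Mail | One iteration | 36.50 | 13.49 | 26.76 | 36.00 | 89.59 | -1.70 |
| CNN/Daily Mail | iterations + No sort | 37.80 | 15.88 | 35.77 | 36.61 | 89.91 | -1.70 |
| Pubmed | ChatGPT | 30.16 | 11.04 | 28.15 | 35.10 | 86.05 | -1.88 |
| Pubmed | One iteration | 30.98 | 11.33 | 28.75 | 37.78 | 88.14 | -1.76 |
| Pubmed | iterations + No sort | 30.98 | 11.33 | 28.75 | 37.78 | 88.14 | -1.76 |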