Table 9 Results of statistical significance test for human evaluation metrics.
From: Knowledge grounded medical dialogue generation using augmented graphs
Models | MedDialog(EN): Fluency | MedDialog(EN): Adequacy | MedDialog(EN): Entity Relevance | Covid Dataset: Fluency | Covid Dataset: Adequacy | Covid Dataset: Entity Relevance |
|---|---|---|---|---|---|---|
BERT | 1.21E−011 | 2.03E−022 | 1.01E−062 | 4.27E−018 | 1.27E−018 | 1.47E−018 |
BART | 1.12E−011 | 1.26E−018 | 1.26E−018 | 1.56E−018 | 1.36E−018 | 2.16E−018 |
BioBERT | 1.20E−010 | 3.04E−012 | 6.04E−022 | 1.27E−017 | 1.27E−017 | 1.27E−017 |