Table 5. Results of the Shapiro-Wilk test for normality for each metric and prompt. The W statistic measures how closely the data follow a normal distribution (values near 1 indicate closer agreement), and a low p-value indicates a departure from normality.

From: Large scale summarization using ensemble prompts and in context learning approaches

| Metric | Prompt | W | p-value |
|---|---|---|---|
| SSF | Vanilla | 0.9791 | 0.05279 |
| SSF | CoD | 0.9790 | 0.05243 |
| SSF | Ev2 | 0.9548 | 0.0579 |
| RLF | Vanilla | 0.7753 | \(3.04 \times 10^{-7}\) |
| RLF | CoD | 0.5440 | \(4.13 \times 10^{-11}\) |
| RLF | Ev2 | 0.6992 | \(1.01 \times 10^{-8}\) |
| Cons | Vanilla | 0.7808 | \(3.99 \times 10^{-7}\) |
| Cons | CoD | 0.5649 | \(7.97 \times 10^{-11}\) |
| Cons | Ev2 | 0.6840 | \(5.48 \times 10^{-9}\) |
| RDF-CRTD | Vanilla | 0.8765 | \(1.02 \times 10^{-4}\) |
| RDF-CRTD | CoD | 0.6396 | \(1.02 \times 10^{-9}\) |
| RDF-CRTD | Ev2 | 0.9423 | 0.0181 |
| BAA | Vanilla | 0.7064 | \(1.37 \times 10^{-8}\) |
| BAA | CoD | 0.5542 | \(5.68 \times 10^{-11}\) |
| BAA | Ev2 | 0.7420 | \(6.35 \times 10^{-8}\) |
| SUSWIR | Vanilla | 0.9104 | 0.0012 |
| SUSWIR | CoD | 0.9535 | 0.0513 |
| SUSWIR | Ev2 | 0.9088 | 0.0011 |
| NIC | Vanilla | 0.4116 | \(9.68 \times 10^{-13}\) |
| NIC | CoD | 0.9541 | 0.0544 |
| NIC | Ev2 | 0.9370 | 0.0114 |
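
For readers who want to run the same kind of check on their own scores, the following is a minimal sketch of a Shapiro-Wilk test using `scipy.stats.shapiro`. It is not the authors' code: the helper name `shapiro_summary`, the significance level of 0.05, and the synthetic samples are all assumptions made for illustration, and the numbers it prints are unrelated to the values in the table.

```python
import numpy as np
from scipy import stats


def shapiro_summary(scores, alpha=0.05):
    """Run a Shapiro-Wilk test on a 1-D sample of scores.

    `alpha` is an assumed significance level; the table above does not
    state which threshold was used.
    """
    w, p = stats.shapiro(scores)
    return {
        "W": float(w),
        "p_value": float(p),
        # A p-value below alpha is read as evidence against normality.
        "reject_normality": bool(p < alpha),
    }


# Synthetic stand-ins for per-summary metric scores (hypothetical data).
rng = np.random.default_rng(0)
normal_like = rng.normal(loc=0.5, scale=0.1, size=100)
skewed = rng.exponential(scale=1.0, size=100)

result_normal = shapiro_summary(normal_like)
result_skewed = shapiro_summary(skewed)

for name, res in [("normal_like", result_normal), ("skewed", result_skewed)]:
    print(f"{name}: W={res['W']:.4f}, p={res['p_value']:.3g}, "
          f"reject_normality={res['reject_normality']}")
```

Under this reading, rows with a p-value below the chosen threshold would point toward non-parametric follow-up tests, while rows just above it (such as those with p near 0.05) are borderline and worth interpreting with care.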