Table 5 Correlation analysis between reliability, quality, and readability scores of ChatGPT-4 responses.

From: Evaluation of the reliability and readability of ChatGPT-4 responses regarding hypothyroidism during pregnancy

 

| | mDISCERN | GQS | FRE | FKGL | GFI | CLI | SMOG |
|---|---|---|---|---|---|---|---|
| mDISCERN | N/A | p = 0.032* | p = 0.708 | p = 0.968 | p = 0.737 | p = 0.517 | p = 0.535 |
| | | r = 0.491 | r = 0.092 | r = −0.010 | r = 0.082 | r = −0.158 | r = 0.152 |
| GQS | p = 0.032* | N/A | p = 0.833 | p = 0.246 | p = 0.191 | p = 0.389 | p = 0.129 |
| | r = 0.491 | | r = −0.052 | r = −0.280 | r = −0.314 | r = 0.209 | r = −0.361 |
| FRE | p = 0.708 | p = 0.833 | N/A | p = 0.001* | p = 0.046* | p < 0.001* | p = 0.039* |
| | r = 0.092 | r = −0.052 | | r = −0.696 | r = −0.464 | r = −0.789 | r = −0.477 |
| FKGL | p = 0.968 | p = 0.246 | p = 0.001* | N/A | p < 0.001* | p = 0.044* | p < 0.001* |
| | r = −0.010 | r = −0.280 | r = −0.696 | | r = 0.947 | r = 0.467 | r = 0.916 |
| GFI | p = 0.737 | p = 0.191 | p = 0.046* | p < 0.001* | N/A | p = 0.253 | p < 0.001* |
| | r = 0.082 | r = −0.314 | r = −0.464 | r = 0.947 | | r = 0.276 | r = 0.971 |
| CLI | p = 0.517 | p = 0.389 | p < 0.001* | p = 0.044* | p = 0.253 | N/A | p = 0.480 |
| | r = −0.158 | r = 0.209 | r = −0.789 | r = 0.467 | r = 0.276 | | r = 0.173 |
| SMOG | p = 0.535 | p = 0.129 | p = 0.039* | p < 0.001* | p < 0.001* | p = 0.480 | N/A |
| | r = 0.152 | r = −0.361 | r = −0.477 | r = 0.916 | r = 0.971 | r = 0.173 | |

mDISCERN: Modified DISCERN; GQS: Global Quality Score; FRE: Flesch Reading Ease score; FKGL: Flesch-Kincaid Grade Level; SMOG: Simple Measure of Gobbledygook; GFI: Gunning Fog Index; CLI: Coleman-Liau Index; N/A: not applicable. Correlations were evaluated with the Pearson correlation test for parametric variables and the Spearman correlation test for non-parametric variables. p < 0.05 was considered statistically significant; significant values are marked with an asterisk (*).
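
The two coefficients named in the note can be illustrated with a minimal Python sketch. The function names (`pearson_r`, `average_ranks`, `spearman_rho`) and the sample scores are hypothetical and are not taken from the study; the sketch computes only the coefficient r, not the p-values, and makes no claim about the software the authors used.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical readability scores for five responses (illustration only).
fre_scores = [30.1, 42.5, 25.0, 38.2, 45.9]
fkgl_scores = [13.2, 11.0, 14.5, 12.1, 10.2]

print("Pearson r:", round(pearson_r(fre_scores, fkgl_scores), 3))
print("Spearman rho:", round(spearman_rho(fre_scores, fkgl_scores), 3))
```

Because Spearman's coefficient works on ranks, it captures any monotonic relationship, whereas Pearson's coefficient measures linear association; this is one reason the rank-based test is commonly chosen for non-parametric variables.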