Table 3. Assessment of the abstracts by four methods other than academicians.

From: Identification of dental related ChatGPT generated abstracts by senior and young academicians versus artificial intelligence detectors and a similarity detector

Variable: GPT-2 output detector (n)

| Abstract type | Low fake | Moderate fake | High fake | Very high fake | Pearson Chi-square* | P-value^ | Phi value |
|---|---|---|---|---|---|---|---|
| Original abstract | 66 | 7 | 2 | 5 | 7.281 | 0.063 | 0.213 |
| AI abstract | 53 | 12 | 9 | 6 | | | |

Variable: Writefull GPT detector (n)

| Abstract type | Entirely human | Mostly human made | Partly by AI | Entirely by AI | Pearson Chi-square | P-value | Phi value |
|---|---|---|---|---|---|---|---|
| Original abstract | 62 | 2 | 5 | 11 | 18.705 | < 0.001 | 0.342 |
| AI abstract | 38 | 13 | 14 | 15 | | | |

Variable: GPTZero detector (n)

| Abstract type | Low fake | Moderate fake | High fake | Very high fake | Pearson Chi-square | P-value^ | Phi value |
|---|---|---|---|---|---|---|---|
| Original abstract | 80 | 0 | 0 | 0 | 144.762 | < 0.001 | 0.951 |
| AI abstract | 4 | 11 | 13 | 53 | | | |

Variable: Similarity outcome (n)

| Abstract type | Low similarity | Moderate similarity | High similarity | Very high similarity | Pearson Chi-square | P-value | Phi value |
|---|---|---|---|---|---|---|---|
| Original abstract | 0 | 0 | 0 | 80 | 144.762 | < 0.001 | 0.951 |
| AI abstract | 23 | 42 | 11 | 4 | | | |

*Fewer than 20% of cells have an expected count below 5.
^Significance level is 0.05.
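
For readers who want to see how the Pearson chi-square and phi values relate to the cell counts, the following is a minimal sketch rather than the authors' analysis code. It assumes the standard Pearson statistic (sum of squared deviations from expected counts, divided by the expected counts) and assumes phi is taken as the square root of chi-square over the total count, which is consistent with the values reported for the GPT-2 output detector. The function name `pearson_chi_square` and the variable `gpt2_counts` are illustrative.

```python
from math import sqrt


def pearson_chi_square(table):
    """Return (chi_square, phi) for a contingency table given as a list of rows.

    Assumes every row total and column total is nonzero, so that every
    expected count is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of abstract type and category.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Assumption: phi is computed as sqrt(chi-square / N).
    phi = sqrt(chi2 / n)
    return chi2, phi


# GPT-2 output detector counts from the table:
# rows are original and AI abstracts, columns are low to very high fake.
gpt2_counts = [[66, 7, 2, 5], [53, 12, 9, 6]]
chi2, phi = pearson_chi_square(gpt2_counts)
print(f"chi-square = {chi2:.3f}, phi = {phi:.3f}")
# prints: chi-square = 7.281, phi = 0.213
```

The P-value would additionally require the chi-square distribution with 3 degrees of freedom (two rows, four columns), which is omitted here to keep the sketch free of external libraries.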