Table 1 Precision statistics.

From: Bulldogs stenosis degree classification using synthetic images created by generative artificial intelligence

Precision: Mean SK (SD), by training set

| Architecture | Real | Combined | Synthetic |
| --- | --- | --- | --- |
| GPT-4o | 0.390 Ba (\(\pm 0.144\)) | 0.397 Ba (\(\pm 0.170\)) | 0.386 Aa (\(\pm 0.243\)) |
| DenseNet201 | 0.512 Aa (\(\pm 0.120\)) | 0.612 Aa (\(\pm 0.084\)) | 0.325 Ab (\(\pm 0.060\)) |
| MobileNetV3 | 0.565 Aa (\(\pm 0.192\)) | 0.358 Bb (\(\pm 0.178\)) | 0.313 Ab (\(\pm 0.062\)) |
| SwinV2 | 0.401 Ba (\(\pm 0.233\)) | 0.311 Ba (\(\pm 0.092\)) | 0.209 Ab (\(\pm 0.143\)) |
| MaxViT | 0.183 Ca (\(\pm 0.117\)) | 0.303 Ba (\(\pm 0.016\)) | 0.284 Aa (\(\pm 0.042\)) |
| ResNet50 | 0.438 Bb (\(\pm 0.162\)) | 0.548 Aa (\(\pm 0.045\)) | 0.342 Ab (\(\pm 0.106\)) |

Humans: 0.572 Aa (\(\pm 0.131\))

Precision: Median (IQR), by training set

| Architecture | Real | Combined | Synthetic |
| --- | --- | --- | --- |
| GPT-4o | 0.359 (0.122) | 0.408 (0.119) | 0.366 (0.297) |
| DenseNet201 | 0.514 (0.181) | 0.579 (0.077) | 0.331 (0.093) |
| MobileNetV3 | 0.629 (0.293) | 0.319 (0.193) | 0.320 (0.079) |
| SwinV2 | 0.441 (0.417) | 0.318 (0.028) | 0.115 (0.225) |
| MaxViT | 0.118 (0.103) | 0.307 (0.010) | 0.273 (0.046) |
| ResNet50 | 0.458 (0.171) | 0.536 (0.050) | 0.316 (0.133) |

Humans: 0.590 (0.135)

  1. The results of the Scott–Knott (SK) test are shown next to the mean values. Within each column, means followed by the same capital letter did not differ at the 5% significance level. Within each row, means followed by the same lowercase letter did not differ at the same level.
  2. Significant values are in bold.
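
The letter convention in note 1 can be made concrete with a short sketch. It is only an illustration of how to read the table, not part of the study's analysis. It assumes each cell is written as plain text such as `0.512 Aa (±0.120)`; the names `Cell`, `parse_cell`, `differ_in_column` and `differ_in_row` are hypothetical helpers introduced here.

```python
import re
from typing import NamedTuple


class Cell(NamedTuple):
    mean: float
    column_group: str  # capital letter: compares architectures within one training set
    row_group: str     # lowercase letter: compares training sets within one architecture
    sd: float


# Assumed plain-text cell layout: "<mean> <Capital><lowercase> (±<sd>)"
CELL_RE = re.compile(r"^\s*([0-9.]+)\s+([A-Z])([a-z])\s+\(±\s*([0-9.]+)\)\s*$")


def parse_cell(text: str) -> Cell:
    """Split one table cell into mean, Scott-Knott letters, and SD."""
    match = CELL_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised cell format: {text!r}")
    mean, capital, lowercase, sd = match.groups()
    return Cell(float(mean), capital, lowercase, float(sd))


def differ_in_column(a: Cell, b: Cell) -> bool:
    """True if two cells from the same column carry different capital letters."""
    return a.column_group != b.column_group


def differ_in_row(a: Cell, b: Cell) -> bool:
    """True if two cells from the same row carry different lowercase letters."""
    return a.row_group != b.row_group


if __name__ == "__main__":
    # Values copied from the mean table above.
    densenet_real = parse_cell("0.512 Aa (±0.120)")
    densenet_synthetic = parse_cell("0.325 Ab (±0.060)")
    gpt4o_real = parse_cell("0.390 Ba (±0.144)")

    # Same column (Real), different architectures.
    print(differ_in_column(densenet_real, gpt4o_real))
    # Same row (DenseNet201), different training sets.
    print(differ_in_row(densenet_real, densenet_synthetic))
```

The comparison functions only make sense when both cells come from the same column or the same row, since the letters are assigned separately within each.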