Table 2 Performance of BLDS and readers in bone lesion detection

From: A clinically applicable AI system for detection and diagnosis of bone metastases using CT scans

 

| | WAFROC FOM (95% CI) | Lesion-wise sensitivity, % (95% CI) | Reading time, mean (SD) (s) | FPPC |
|---|---|---|---|---|
| BLDS | 0.745 | 0.891 (0.885, 0.897) | | 1.40 |

| Reader | WAFROC FOM (95% CI), without BLDS | WAFROC FOM (95% CI), with BLDS | WAFROC FOM, P value | Lesion-wise sensitivity, % (95% CI), without BLDS | Lesion-wise sensitivity, % (95% CI), with BLDS | Lesion-wise sensitivity, P value | Reading time, mean (SD) (s), without BLDS | Reading time, mean (SD) (s), with BLDS | Reading time, P value | FPPC, without BLDS | FPPC, with BLDS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Trainee 1 | 0.624 (0.593, 0.657) | 0.901 (0.934, 0.869) | <0.001 | 0.345 (0.304, 0.323) | 0.865 (0.858, 0.872) | <0.001 | 93 (80) | 59 (44) | <0.001 | 0.37 | 0.58 |
| Trainee 2 | 0.664 (0.634, 0.693) | 0.898 (0.928, 0.869) | <0.001 | 0.459 (0.449, 0.469) | 0.848 (0.840, 0.855) | <0.001 | 175 (111) | 99 (62) | <0.001 | 0.55 | 0.29 |
| Trainee 3 | 0.642 (0.628, 0.655) | 0.937 (0.950, 0.924) | <0.001 | 0.307 (0.298, 0.316) | 0.895 (0.889, 0.902) | <0.001 | 100 (89) | 84 (73) | <0.001 | 0.25 | 0.23 |
| Average of trainees | 0.644 (0.602, 0.685) | 0.912 (0.863, 0.962) | 0.003 | 0.360 (0.350, 0.370) | 0.869 (0.862, 0.876) | <0.001 | 122 (101) | 81 (63) | <0.001 | 0.39 | 0.37 |
| Junior radiologist 1 | 0.691 (0.669, 0.714) | 0.874 (0.852, 0.897) | <0.001 | 0.440 (0.430, 0.451) | 0.751 (0.742, 0.760) | <0.001 | 203 (89) | 168 (68) | <0.001 | 0.96 | 0.23 |
| Junior radiologist 2 | 0.647 (0.625, 0.670) | 0.854 (0.832, 0.877) | <0.001 | 0.349 (0.340, 0.359) | 0.731 (0.722, 0.740) | <0.001 | 114 (73) | 85 (50) | <0.001 | 1.61 | 0.87 |
| Junior radiologist 3 | 0.675 (0.646, 0.705) | 0.917 (0.888, 0.947) | <0.001 | 0.563 (0.553, 0.573) | 0.891 (0.884, 0.897) | <0.001 | 182 (103) | 145 (84) | <0.001 | 0.19 | 0.39 |
| Average of junior radiologists | 0.672 (0.628, 0.715) | 0.882 (0.803, 0.961) | 0.004 | 0.451 (0.441, 0.461) | 0.791 (0.783, 0.799) | <0.001 | 166 (97) | 132 (77) | <0.001 | 0.92 | 0.50 |
| Pooled readers | 0.658 (0.632, 0.683) | 0.897 (0.866, 0.928) | <0.001 | 0.405 (0.395, 0.415) | 0.83 (0.822, 0.838) | <0.001 | 144 (101) | 106 (75) | <0.001 | 0.65 | 0.43 |

  1. Statistical significance was evaluated using three distinct two-sided methods without multiplicity correction: the Wilcoxon signed-rank test for the WAFROC figure of merit (FOM), paired Student’s t-test for lesion-wise sensitivity comparisons, and Mann–Whitney U-test for reading time analysis. For all analyses, P < 0.05 was considered to indicate a statistically significant difference.
  2. WAFROC FOM: weighted alternative free-response receiver operating characteristic figure of merit; FPPC: false positive count per case; BLDS: bone lesion detection system; CI: confidence interval.
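
As a reading aid, the pooled-reader row of Table 2 can be converted into absolute and relative changes between unaided and BLDS-aided reading. The sketch below is illustrative only: the point estimates are copied from the table, while the relative changes are derived quantities that the table itself does not report, and the names `pooled`, `change`, and `summary` are hypothetical.

```python
# Pooled-reader point estimates copied from Table 2,
# stored as (without BLDS, with BLDS).
pooled = {
    "WAFROC FOM": (0.658, 0.897),
    "Lesion-wise sensitivity": (0.405, 0.83),
    "Reading time (s)": (144, 106),
    "FPPC": (0.65, 0.43),
}


def change(without_blds, with_blds):
    """Return (absolute change, relative change in %) from unaided to aided.

    The relative change is an assumption of this sketch, not a value
    reported in the table.
    """
    absolute = with_blds - without_blds
    relative = 100.0 * absolute / without_blds
    return absolute, relative


# One (absolute, relative) pair per metric.
summary = {name: change(*pair) for name, pair in pooled.items()}

for name, (absolute, relative) in summary.items():
    print(f"{name}: {absolute:+.3f} ({relative:+.1f}%)")
```

A negative change is an improvement for reading time and FPPC, whereas a positive change is an improvement for WAFROC FOM and lesion-wise sensitivity. These differences describe point estimates only and say nothing about significance, which the table addresses through the reported P values.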