Table 4 Sensitivity and specificity before and after arbitration

From: Impact of using artificial intelligence as a second reader in breast screening including arbitration

 

Values are percentages with the 95% CI in brackets.

Center 1

| | n | Human arm, before arbitration (%) | Human arm, after arbitration (%) | AI arm, before arbitration (%) | AI arm, after arbitration (%) |
|---|---|---|---|---|---|
| Sensitivity | | | | | |
| Screen detected | 169 | 100.0 (100.0, 100.0) | 95.3 (92.1, 98.5) | 98.8 (97.2, 100.4) | 94.1 (90.5, 97.6) |
| Interval cancer | 68 | 13.2 (5.2, 21.3) | 4.4 (−0.5, 9.3) | 32.4 (21.2, 43.5) | 10.3 (3.1, 17.5) |
| Next-round cancer | 112 | 11.6 (5.7, 17.5) | 3.6 (0.1, 7.0) | 34.8 (26.0, 43.6) | 8.0 (3.0, 13.1) |
| Total^a | 349 | 54.7 (49.5, 60.0) | 48.1 (42.9, 53.4) | 65.3 (60.3, 70.3) | 50.1 (44.9, 55.4) |
| Specificity | | | | | |
| Total | 22,670 | 93.6 (93.3, 93.9) | 96.4 (96.1, 96.6) | 89.9 (89.5, 90.3) | 97.0 (96.8, 97.3) |

Center 2

| | n | Human arm, before arbitration (%) | Human arm, after arbitration (%) | AI arm, before arbitration (%) | AI arm, after arbitration (%) |
|---|---|---|---|---|---|
| Sensitivity | | | | | |
| Screen detected | 183 | 100.0 (100.0, 100.0) | 90.2 (85.8, 94.5) | 99.5 (98.4, 100.5) | 90.7 (86.5, 94.9) |
| Interval cancer | 68 | 17.6 (8.6, 26.7) | 7.4 (1.1, 13.6) | 32.4 (21.2, 43.5) | 7.4 (1.1, 13.6) |
| Next-round cancer | 123 | 13.8 (7.7, 19.9) | 7.3 (2.7, 11.9) | 33.3 (25.0, 41.7) | 8.1 (3.3, 13.0) |
| Total^a | 374 | 56.7 (51.7, 61.7) | 47.9 (42.8, 52.9) | 65.5 (60.7, 70.3) | 48.4 (43.3, 53.5) |
| Specificity | | | | | |
| Total | 22,208 | 88.8 (88.4, 89.3) | 96.6 (96.4, 96.8) | 86.5 (86.1, 87.0) | 96.6 (96.3, 96.8) |

Both centers

| | n | Human arm, before arbitration (%) | Human arm, after arbitration (%) | AI arm, before arbitration (%) | AI arm, after arbitration (%) |
|---|---|---|---|---|---|
| Sensitivity | | | | | |
| Screen detected | 352 | 100.0 (100.0, 100.0) | 92.6 (89.9, 95.3) | 99.1 (98.2, 100.1) | 92.3 (89.5, 95.1) |
| Interval cancer | 136 | 15.4 (9.4, 21.5) | 5.9 (1.9, 9.8) | 32.4 (24.5, 40.2) | 8.8 (4.1, 13.6) |
| Next-round cancer | 235 | 12.8 (8.5, 17.0) | 5.5 (2.6, 8.5) | 34.0 (28.0, 40.1) | 8.1 (4.6, 11.6) |
| Total^a | 723 | 55.7 (52.1, 59.4) | 48.0 (44.4, 51.6) | 65.4 (62.0, 68.9) | 49.2 (45.6, 52.9) |
| Specificity | | | | | |
| Total | 44,879 | 91.3 (91.0, 91.5) | 96.5 (96.3, 96.7) | 88.2 (87.9, 88.5) | 96.8 (96.6, 97.0) |

  1. Values in brackets show the 95% CI. Sensitivity is given for each type of positive case. Before arbitration, a recall by at least one reader counted as a recall outcome. The n is the number of positive cases (for sensitivity) or negative cases (for specificity) used in the calculation.
  2. ^a The total sensitivity is around 50% because of the long-term follow-up: a positive case includes screen-detected cancers, interval cancers and next-round screen-detected cancers.
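
As an illustration of how the percentages and bracketed 95% CIs relate to n, the sketch below computes a proportion with a normal-approximation (Wald) interval. The choice of method is an assumption: the table does not name it, although intervals that extend below 0 or above 100 (for example −0.5 or 100.4) are consistent with an unbounded interval of this kind. The counts 9 of 68 and 167 of 169 are also assumptions, back-calculated from the reported percentages rather than taken from the table, and `wald_ci` is a hypothetical helper name.

```python
import math


def wald_ci(detected, n, z=1.96):
    """Return (estimate, lower, upper) in percent for detected/n.

    Uses the normal-approximation (Wald) interval, which is NOT clipped
    to the 0-100 range. Assumed method, not confirmed by the table.
    """
    p = detected / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return 100 * p, 100 * (p - half_width), 100 * (p + half_width)


# Assumed counts, inferred from the reported percentages:
#   9 of 68 interval cancers recalled (Center 1, human arm, before arbitration)
#   167 of 169 screen-detected cancers recalled (Center 1, AI arm, before arbitration)
for detected, n in [(9, 68), (167, 169)]:
    est, lo, hi = wald_ci(detected, n)
    print(f"{est:.1f} ({lo:.1f}, {hi:.1f})")
# prints:
# 13.2 (5.2, 21.3)
# 98.8 (97.2, 100.4)
```

Under these assumptions the output matches the corresponding Center 1 cells. A bounded method such as the Wilson interval would keep the limits within 0 to 100, but would not reproduce the reported values.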