Table 3 Outcome metrics for standard double reading versus double reading plus the AI-assisted additional-reader workflow
From: Prospective implementation of AI-assisted screen reading to improve early detection of breast cancer
Variable | Double reading: Num/Denom | Double reading: Value (95% CI) | Double reading plus the AI-assisted additional-reader workflow: Num/Denom | Double reading plus the AI-assisted additional-reader workflow: Value (95% CI) | Difference |
---|---|---|---|---|---|
Results of phase 1, pilot rollout (1 site, 1 additional arbitrator, additional arbitration cases were single read), n = 3,746 screens | |||||
CDR (per 1,000 cases) | 48/3,746 | 12.8 (9.7–16.9) | 54/3,746 | 14.4 (11.1–18.8) | 1.6a |
RR (%) | 250/3,746 | 6.7 (5.9–7.5) | 256/3,746 | 6.8 (6.1–7.7) | 0.2 |
Sen (%) | 48/58 | 82.8 (71.7–90.4) | 54/58 | 93.1 (83.6–97.3) | 10.3a |
Spec (%) | 3,486/3,688 | 94.5 (93.7–95.2) | 3,486/3,688 | 94.5 (93.7–95.2) | 0.0 |
PPV (%) | 48/250 | 19.2 (14.8–24.5) | 54/256 | 21.1 (16.5–26.5) | 1.9 |
Arbitration rate (%) | 114/3,746 | 3.0 (2.5–3.6) | 510/3,746 | 13.6 (12.6–14.8) | 10.6 |
Positive discordance rate (%) | – | – | 396/3,746 | 10.6 (9.6–11.6) | – |
RR of additional arbitration (%) | – | – | 6/396 | 1.5 (0.7–3.3) | – |
PPV of additional arbitration (%) | – | – | 6/6 | 100 (61.0–100) | – |
Results of phase 2, extended pilot (4 sites, 3 additional arbitrators, all additional arbitration cases were read by each additional reader), n = 9,112 screens | |||||
CDR (per 1,000 cases) | 126/9,112 | 13.8 (11.6–16.4) | 139/9,112 | 15.3 (12.9–18.0) | 1.4a |
RR (%) | 639/9,112 | 7.0 (6.5–7.6) | 661/9,112 | 7.3 (6.7–7.8) | 0.2 |
Sen (%) | 126/145 | 86.9 (80.4–91.4) | 139/145 | 95.9 (91.3–98.1) | 9.0a |
Spec (%) | 8,454/8,967 | 94.3 (93.8–94.7) | 8,445/8,967 | 94.2 (93.7–94.6) | −0.1 |
PPV (%) | 126/639 | 19.7 (16.8–23.0) | 139/661 | 21.0 (18.1–24.3) | 1.3 |
Arbitration rate (%) | 270/9,112 | 3.0 (2.6–3.3) | 1,294/9,112 | 14.2 (13.5–14.9) | 11.2 |
Positive discordance rate (%) | – | – | 1,024/9,112 | 11.2 (10.6–11.9) | – |
RR of additional arbitration (%) | – | – | 22/1,024 | 2.1 (1.4–3.2) | – |
PPV of additional arbitration (%) | – | – | 13/22 | 59.1 (38.7–76.7) | – |
Results of phase 3, live use in standard clinical practice (4 sites, 3 additional arbitrators, additional arbitration cases were single read), n = 15,953 screens | |||||
CDR (per 1,000 cases) | 238/15,953 | 14.9 (13.2–16.9) | 249/15,953 | 15.6 (13.8–17.7) | 0.7a |
RR (%) | 1,228/15,953 | 7.7 (7.3–8.1) | 1,276/15,953 | 8.0 (7.6–8.4) | 0.3 |
Sen (%) | 238/253 | 94.1 (90.4–96.4) | 249/253 | 98.4 (96.0–99.4) | 4.3a |
Spec (%) | 14,710/15,700 | 93.7 (93.3–94.1) | 14,673/15,700 | 93.5 (93.1–93.8) | −0.2 |
PPV (%) | 238/1,228 | 19.4 (17.3–21.7) | 249/1,276 | 19.5 (17.4–21.8) | 0.1 |
Arbitration rate (%) | 529/15,953 | 3.3 (3.0–3.6) | 1,715/15,953 | 10.8 (10.3–11.2) | 7.4 |
Positive discordance rate (%) | – | – | 1,186/15,953 | 7.4 (7.0–7.9) | – |
RR of additional arbitration (%) | – | – | 48/1,186 | 4.0 (3.1–5.3) | – |
PPV of additional arbitration (%) | – | – | 11/48 | 22.9 (13.3–36.5) | – |
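
Each value in the table is a proportion computed from its Num/Denom column, scaled to a percentage (or to cases per 1,000 for CDR). The sketch below recomputes a few rows as a worked check. The table does not state how the 95% CIs were obtained. The Wilson score interval used here is an assumption, chosen because it reproduces the rows checked below, including the 6/6 row in phase 1 (61.0–100). The helper names `wilson_ci` and `summarise` are hypothetical and are not taken from the source article.

```python
import math

Z_95 = 1.96  # assumed critical value for a two-sided 95% interval


def wilson_ci(num, denom, z=Z_95):
    """Wilson score interval for the proportion num/denom, as fractions.

    Assumption: the table does not name its CI method; Wilson is used
    here only because it matches the rows checked below.
    """
    p = num / denom
    scale = 1 + z**2 / denom
    centre = (p + z**2 / (2 * denom)) / scale
    half = z * math.sqrt(p * (1 - p) / denom + z**2 / (4 * denom**2)) / scale
    return centre - half, centre + half


def summarise(num, denom, per=100):
    """Return (value, lower, upper), scaled by `per` and rounded to 1 decimal."""
    lower, upper = wilson_ci(num, denom)
    return tuple(round(x * per, 1) for x in (num / denom, lower, upper))


# Phase 2 sensitivity: double reading vs. double reading plus the AI workflow
print(summarise(126, 145))  # (86.9, 80.4, 91.4)
print(summarise(139, 145))  # (95.9, 91.3, 98.1)

# Difference in percentage points, taken from the unrounded proportions
print(round((139 - 126) / 145 * 100, 1))  # 9.0

# Phase 1 cancer detection rate, expressed per 1,000 screens
print(summarise(48, 3746, per=1000))  # (12.8, 9.7, 16.9)
```

Because the Difference column is computed from unrounded proportions, it can differ by 0.1 from the difference of the rounded values shown, as in the phase 1 RR row (6.8 versus 6.7, reported difference 0.2). The additional-arbitration rows are also consistent with the main rows: in phase 3, for example, the 48 additional recalls equal 1,276 − 1,228 and the 11 additional cancers equal 249 − 238.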