Table 2 Performance of the model in all cohorts.
Cohort | Accuracy | Patient-level sensitivity | Specificity | PPV | NPV | Lesion-level sensitivity | FPs/case | Dice ratio |
---|---|---|---|---|---|---|---|---|
Internal cohort 1 | 86.0% (79.5%–90.7%) | 97.3% (90.8%–99.3%) | 74.7% (63.8%–83.1%) | 79.4% (70.0%–86.4%) | 96.6% (88.3%–99.1%) | 95.6% (89.1%–98.3%) | 0.29 | 0.75 |
Internal cohort 2 | 88.6% (83.7%–92.1%) | 94.4% (87.8%–97.7%) | 83.9% (76.5%–89.5%) | 82.3% (74.1%–88.3%) | 95.0% (89.1%–98%) | 84.1% (76.9%–89.5%) | 0.26 | 0.65 |
Internal cohort 3 | 83.6% (78%–88.1%) | 78.7% (66%–87.7%) | 85.5% (78.9%–90.3%) | 66.7% (54.5%–77.1%) | 91.6% (85.7%–95.2%) | 68.8% (57.3%–78.4%) | 0.19 | 0.52 |
Internal cohort 4 | 85.8% (81.8%–89.1%) | 73.6% (59.4%–84.3%) | 87.9% (83.6%–91.1%) | 50.0% (39.2%–60.8%) | 95.3% (81.8%–89.1%) | 60.6% (48.2%–71.7%) | 0.20 | 0.45 |
Internal cohort 5 | 89.2% (85.2%–92.2%) | 78.6% (48.8%–94.3%) | 89.7% (85.7%–92.7%) | 25.0% (13.7%–40.6%) | 99.0% (96.7%–99.7%) | 68.8% (41.5%–87.9%) | 0.13 | 0.47 |
NBH cohort | 81.0% (75%–86%) | 84.6% (68.8%–93.6%) | 80.2% (73.3%–85.7%) | 49.3% (37%–61.6%) | 95.8% (90.8%–98.3%) | 76.1% (62.1%–86.1%) | 0.27 | 0.53 |
TJ cohort | 74.8% (66.9%–81.5%) | 76.1% (66.9%–83.6%) | 71.1% (53.9%–84.0%) | 88.3% (79.6%–93.7%) | 50.9% (37%–64.7%) | 73.0% (64.8%–80%) | 0.44 | 0.45 |
LYG cohort | 76.6% (71.4%–81.1%) | 85.0% (72.9%–92.5%) | 74.6% (68.7%–79.7%) | 44.0% (34.9%–53.5%) | 95.5% (91.4%–97.8%) | 78.9% (67.8%–87.1%) | 0.32 | 0.56 |
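
For readers who want to relate the columns to one another, the sketch below shows how the reported quantities are conventionally defined. It assumes the standard confusion-matrix definitions for the patient-level metrics and the usual overlap definition of the Dice ratio; the table does not state how the values or their confidence intervals were actually computed. The function names and the example counts are hypothetical and are not taken from the study.

```python
def patient_level_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics, returned as fractions in [0, 1].

    Assumes every denominator is non-zero (at least one positive case,
    one negative case, one positive call, and one negative call).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


def lesion_level_summary(detected_lesions, total_lesions, false_positives, n_cases):
    """Lesion-level sensitivity and mean false positives per case."""
    return {
        "lesion_sensitivity": detected_lesions / total_lesions,
        "fps_per_case": false_positives / n_cases,
    }


def dice_ratio(pred_mask, true_mask):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Two empty masks agree perfectly by convention.
    return 2 * overlap / total if total else 1.0


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_metrics = patient_level_metrics(tp=80, fp=20, tn=80, fn=20)
example_lesions = lesion_level_summary(
    detected_lesions=45, total_lesions=60, false_positives=30, n_cases=100
)
example_dice = dice_ratio([1, 1, 0, 0], [1, 0, 1, 0])
```

Because PPV and NPV depend on the proportion of positive cases in a cohort, two cohorts with similar sensitivity and specificity can still show quite different predictive values, which is worth keeping in mind when comparing rows.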