Table 2 Model performance across test sets
Test set / model | MAE (regression task, mRS score) | AUC (classification task, poor functional outcome) | Sensitivity (%) | Specificity (%) | PPV (%) |
|---|---|---|---|---|---|
AH | |||||
Pre-operative model | 1.46 (1.35–1.58) | 0.82 (0.76–0.88) | 76.8 (67.5–84.9) | 79.0 (72.6–85.6) | 64.8 (55.7–74.1) |
Post-operative model | 1.17 (1.07–1.28) | 0.86 (0.80–0.91) | 68.3 (57.8–78.4) | 86.4 (81.1–91.2) | 72.0 (62.1–81.1) |
Stacking imaging model | 0.98 (0.86–1.10) | 0.87 (0.82–0.91) | 65.8 (55.1–75.6) | 88.3 (82.8–93.2) | 73.8 (63.3–82.7) |
Clinical model | 1.10 (0.95–1.25) | 0.84 (0.78–0.89) | 57.4 (46.9–67.1) | 88.9 (83.8–93.3) | 72.3 (60.0–83.6) |
Fusion model | 0.88 (0.76–0.99) | 0.90 (0.85–0.93) | 64.7 (54.2–74.6) | 91.4 (87.0–95.5) | 79.1 (69.2–88.7) |
FY | |||||
Pre-operative model | 1.41 (1.30–1.53) | 0.82 (0.76–0.88) | 75.2 (65.2–83.8) | 78.2 (71.7–84.8) | 67.0 (57.1–76.5) |
Post-operative model | 1.07 (0.98–1.17) | 0.90 (0.85–0.94) | 83.2 (75.0–90.8) | 79.6 (72.9–85.9) | 70.2 (61.7–78.7) |
Stacking imaging model | 0.91 (0.79–1.02) | 0.91 (0.86–0.95) | 83.2 (75.0–90.5) | 81.5 (75.0–87.3) | 72.6 (63.2–81.2) |
Clinical model | 0.94 (0.81–1.08) | 0.88 (0.84–0.92) | 61.6 (51.8–72.4) | 92.8 (88.4–96.6) | 83.6 (74.2–91.9) |
Fusion model | 0.73 (0.63–0.84) | 0.93 (0.89–0.96) | 81.8 (73.1–89.5) | 91.7 (86.9–95.9) | 85.0 (76.9–92.3) |
TL | |||||
Pre-operative model | 1.47 (1.35–1.58) | 0.88 (0.83–0.92) | 77.7 (68.4–86.7) | 85.3 (79.4–91.1) | 77.8 (68.3–86.2) |
Post-operative model | 1.12 (1.01–1.23) | 0.93 (0.89–0.96) | 86.3 (78.2–93.4) | 87.7 (81.7–93.0) | 82.2 (74.1–90.1) |
Stacking imaging model | 0.87 (0.76–0.99) | 0.93 (0.89–0.96) | 85.4 (77.5–92.2) | 88.5 (82.9–93.9) | 83.2 (74.7–90.7) |
Clinical model | 0.96 (0.81–1.12) | 0.89 (0.83–0.94) | 64.4 (53.9–74.7) | 94.3 (89.8–97.7) | 88.1 (78.9–95.5) |
Fusion model | 0.74 (0.63–0.85) | 0.94 (0.90–0.97) | 84.1 (75.6–91.5) | 93.5 (88.4–97.5) | 89.6 (82.2–96.1) |
Test-Combined | |||||
Pre-operative model | 1.45 (1.38–1.52) | 0.84 (0.80–0.87) | 76.3 (71.2–81.7) | 80.5 (76.8–84.3) | 69.5 (64.2–75.3) |
Post-operative model | 1.12 (1.06–1.18) | 0.89 (0.87–0.92) | 79.2 (74.2–84.3) | 84.5 (81.0–87.8) | 74.7 (69.3–79.8) |
Stacking imaging model | 0.92 (0.86–0.99) | 0.90 (0.87–0.92) | 78.0 (72.8–83.2) | 86.2 (82.6–89.2) | 76.6 (71.1–81.7) |
Clinical model | 1.00 (0.92–1.09) | 0.87 (0.84–0.90) | 60.9 (54.8–66.9) | 91.8 (89.2–94.2) | 81.0 (75.0–86.3) |
Fusion model | 0.79 (0.72–0.85) | 0.92 (0.90–0.94) | 76.9 (71.5–82.0) | 92.0 (89.4–94.7) | 84.7 (79.9–89.1) |