Table 2 Test set performance, with 95% confidence intervals, of hand-optimized versus automatically optimized models for the 1st and 2nd segmentations, with radiologist comparison.
Model or reader | ROC AUC | PR AUC | Accuracy (95% CI) | P value | Sensitivity (95% CI) | P value | Specificity (95% CI) | P value | Kappa |
---|---|---|---|---|---|---|---|---|---|
Segmentation 1 | |||||||||
Radiomics pipeline (VBFS + LR) | 0.80 | 0.81 | 0.73 (0.59–0.84) | 0.05 | 0.75 (0.53–0.89) | 0.58 | 0.71 (0.52–0.85) | 0.007 | 0.45 |
TPOT | 0.76 | 0.76 | 0.73 (0.59–0.84) | 0.05 | 0.65 (0.43–0.82) | 0.10 | 0.79 (0.61–0.90) | 0.25 | 0.44 |
Segmentation 2 | |||||||||
Radiomics pipeline (VBFS + LR) | 0.79 | 0.80 | 0.75 (0.61–0.85) | 0.11 | 0.70 (0.48–0.86) | 0.26 | 0.79 (0.61–0.90) | 0.25 | 0.49 |
TPOT | 0.79 | 0.77 | 0.75 (0.61–0.85) | 0.11 | 0.75 (0.53–0.89) | 0.58 | 0.75 (0.56–0.88) | 0.08 | 0.49 |
Radiologist 1 | NA | NA | 0.77 (0.63–0.87) | 0.11 | 0.75 (0.53–0.89) | 0.58 | 0.79 (0.61–0.90) | 0.25 | 0.53 |
Radiologist 2 | NA | NA | 0.83 (0.70–0.91) | 0.56 | 0.80 (0.58–0.93) | 1.00 | 0.86 (0.68–0.95) | 0.78 | 0.66 |
Radiologist 3 | NA | NA | 0.88 (0.76–0.95) | 0.69 | 0.80 (0.58–0.93) | 1.00 | 0.93 (0.76–0.99) | 0.57 | 0.74 |
Radiologist 4 | NA | NA | 0.88 (0.76–0.95) | 0.69 | 0.85 (0.63–0.96) | 0.78 | 0.89 (0.72–0.97) | 0.79 | 0.74 |
Mean radiologist | NA | NA | 0.84 (0.71–0.92) | 1.00 | 0.80 (0.58–0.93) | 1.00 | 0.87 (0.69–0.96) | 1.00 | NA |