Fig. 3: Model performance evaluation: ROC curve analysis, calibration assessment, and decision curve analysis for test and external validation datasets.
From: AI prediction model for endovascular treatment of vertebrobasilar occlusion with atrial fibrillation

a, b ROC curve analyses and optimal thresholds for the test dataset (a) and the external validation dataset (b). Using the DeLong method, the AUC for the test dataset is 0.719 (95% CI: 0.639–0.799), and the optimal threshold corresponds to a sensitivity of 0.581 and a specificity of 0.803. For the external validation dataset, the AUC is 0.684 (95% CI: 0.586–0.783), and the optimal threshold corresponds to a sensitivity of 0.562 and a specificity of 0.773. c, d Calibration curves for the model on the test dataset (c) and the external validation dataset (d). The curve for the test dataset deviates from the ideal line to some extent, whereas the curve for the external validation dataset follows the ideal line more closely, particularly in the higher probability range, indicating better calibration on that dataset. e, f Decision curve analysis (DCA) of the logistic regression model on the test dataset and the external validation dataset. The model’s predictions are deemed diagnostically relevant in 82% of cases for the test dataset and 74% for the external dataset.
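As a reading aid for panels a, b, e and f, the quantities involved can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The caption does not say how the optimal threshold was chosen; Youden's J (sensitivity + specificity - 1) is assumed here only because it is a common criterion. The net-benefit formula is the standard one used in decision curve analysis, and the patient counts passed to it below are hypothetical, not taken from the study.

```python
def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1.

    ASSUMPTION: the caption does not state the threshold criterion;
    J is shown only as one common way to summarise an operating point.
    """
    return sensitivity + specificity - 1.0


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Standard DCA net benefit at a given threshold probability.

    net benefit = TP/n - (FP/n) * (pt / (1 - pt))
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(prevalence: float, threshold: float) -> float:
    """Net benefit of the 'treat everyone' reference strategy in DCA."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Operating points reported in the caption (sensitivity, specificity).
j_test = youden_j(0.581, 0.803)
j_external = youden_j(0.562, 0.773)

# HYPOTHETICAL counts, for illustration only (not from the study):
# 100 patients, 30 true positives and 10 false positives at pt = 0.2.
nb_example = net_benefit(tp=30, fp=10, n=100, threshold=0.2)

if __name__ == "__main__":
    print(f"Youden's J, test dataset:     {j_test:.3f}")
    print(f"Youden's J, external dataset: {j_external:.3f}")
    print(f"Net benefit (hypothetical):   {nb_example:.3f}")
```

In a decision curve, a model is considered useful at a given threshold probability when its net benefit exceeds both the "treat everyone" reference (`net_benefit_treat_all`) and the "treat no one" reference, whose net benefit is zero.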