Table 3 Fazekas rating performance.
| Task | Agreement (Q kappa) | Binary categorization: categories | AUROC† | AUPRC† |
|---|---|---|---|---|
| Internal validation set | | | | |
| Regression | 0.904 | 0 vs. 1, 2, 3 | 1.000 [1.000–1.000] | 1.000 [1.000–1.000] |
| | | 0, 1 vs. 2, 3 | 0.969 [0.956–0.981] | 0.970 [0.955–0.985] |
| | | 0, 1, 2 vs. 3 | 0.940 [0.918–0.961] | 0.840 [0.786–0.894] |
| Classification | 0.917 | 0 vs. 1, 2, 3 | 0.997 [0.993–1.000] | 0.999 [0.999–0.999] |
| | | 0, 1 vs. 2, 3 | 0.975 [0.963–0.987] | 0.977 [0.964–0.990] |
| | | 0, 1, 2 vs. 3 | 0.957 [0.937–0.978] | 0.900 [0.862–0.937] |
| Internal test set | | | | |
| Regression | 0.951 | 0 vs. 1, 2, 3 | 1.000 [1.000–1.000] | 1.000 [1.000–1.000] |
| | | 0, 1 vs. 2, 3 | 0.987 [0.975–0.999] | 0.988 [0.975–0.998] |
| | | 0, 1, 2 vs. 3 | 0.987 [0.973–1.000] | 0.970 [0.954–0.984] |
| Classification | 0.956 | 0 vs. 1, 2, 3 | 1.000 [0.999–1.000] | 1.000 [1.000–1.000] |
| | | 0, 1 vs. 2, 3 | 0.991 [0.981–1.000] | 0.992 [0.984–0.998] |
| | | 0, 1, 2 vs. 3 | 0.991 [0.982–1.000] | 0.977 [0.962–0.991] |
| External validation set | | | | |
| Regression | 0.898 | 0 vs. 1, 2, 3 | 0.957 [0.895–1.000] | 0.997 [0.994–1.000] |
| | | 0, 1 vs. 2, 3 | 0.982 [0.957–1.000] | 0.984 [0.961–1.000] |
| | | 0, 1, 2 vs. 3 | 1.000 [1.000–1.000] | 1.000 [1.000–1.000] |
| Classification | 0.956 | 0 vs. 1, 2, 3 | 0.972 [0.922–1.000] | 0.998 [0.995–1.000] |
| | | 0, 1 vs. 2, 3 | 0.992 [0.978–1.000] | 0.993 [0.983–1.000] |
| | | 0, 1, 2 vs. 3 | 1.000 [1.000–1.000] | 1.000 [1.000–1.000] |
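
For readers who want to see how metrics of this kind are typically obtained, the following is a minimal sketch, not the authors' pipeline. It assumes that "Q kappa" denotes quadratic-weighted Cohen's kappa over the four Fazekas grades (0 to 3), that each binary categorization is formed by thresholding the reference grade, and that a continuous model score is available for ranking. The toy data and the helper names (`quadratic_weighted_kappa`, `binarize`, `auroc`) are hypothetical, and confidence intervals are not computed here.

```python
from collections import Counter


def quadratic_weighted_kappa(truth, pred, n_grades=4):
    """Cohen's kappa with quadratic weights for ordinal grades 0..n_grades-1."""
    n = len(truth)
    observed = Counter(zip(truth, pred))  # joint counts of (reference, predicted)
    truth_hist = Counter(truth)           # marginal counts of reference grades
    pred_hist = Counter(pred)             # marginal counts of predicted grades
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_grades):
        for j in range(n_grades):
            w = (i - j) ** 2 / (n_grades - 1) ** 2
            num += w * observed[(i, j)] / n
            den += w * truth_hist[i] * pred_hist[j] / (n * n)
    return 1.0 - num / den


def binarize(grades, threshold):
    """Map ordinal grades to 0/1: positive when grade >= threshold."""
    return [int(g >= threshold) for g in grades]


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5. Quadratic in sample size, which is fine for a sketch.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reference grades and continuous model scores.
truth = [0, 0, 1, 1, 2, 2, 3, 3]
scores = [0.1, 1.3, 1.0, 1.2, 2.1, 2.6, 2.9, 3.0]

# Assumption: predicted grades come from rounding the score into the 0..3 range.
pred = [min(3, max(0, round(s))) for s in scores]

kappa = quadratic_weighted_kappa(truth, pred)

# Thresholds 1, 2, 3 correspond to "0 vs. 1, 2, 3", "0, 1 vs. 2, 3", "0, 1, 2 vs. 3".
aurocs = {t: auroc(binarize(truth, t), scores) for t in (1, 2, 3)}

print(f"quadratic-weighted kappa: {kappa:.3f}")
for t, value in aurocs.items():
    print(f"grade >= {t}: AUROC = {value:.3f}")
```

The three thresholds mirror the three category splits in each block of the table. AUPRC and the bracketed intervals would need additional steps (for example, a precision-recall computation and resampling) that are left out of this sketch.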