Table 2 Diagnostic performance of the AI model compared with physicians with varying experience (ranging from 3 to >10 years) in CMR reading
| | | No. of subjects (n = 500) | F1 score | | | |
|---|---|---|---|---|---|---|
| | | | AI model | Physician (3–5 years) | Physician (5–10 years) | Physician (>10 years) |
| 1 | HCM | 100 | 0.971 | 0.957 | 0.938 | 0.962 |
| 2 | DCM | 100 | 0.914 | 0.853 | 0.911 | 0.940 |
| 3 | CAD | 80 | 0.962 | 0.916 | 0.949 | 0.969 |
| 4 | LVNC | 30 | 0.877 | 0.667 | 0.778 | 0.885 |
| 5 | RCM | 30 | 0.933 | 0.578 | 0.760 | 0.800 |
| 6 | CAM | 30 | 0.947 | 0.667 | 0.931 | 0.931 |
| 7 | HHD | 30 | 0.833 | 0.615 | 0.667 | 0.896 |
| 8 | Myocarditis | 20 | 0.857 | 0.553 | 0.600 | 0.683 |
| 9 | ARVC | 30 | 0.897 | 0.451 | 0.814 | 0.983 |
| 10 | PAH | 30 | 0.983 | 0.061 | 0.929 | 0.931 |
| 11 | Ebstein’s anomaly | 20 | 0.950 | 0.519 | 0.842 | 0.974 |
| | Frequency-weighted F1 | | 0.931 | 0.734 | 0.872 | 0.927 |
| | Accuracy | | 0.932 | 0.746 | 0.868 | 0.928 |
| | Time cost (in total) | | 1.94 min | 576 min | 329 min | 418 min |
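
As an illustration of how the frequency-weighted F1 row relates to the per-disease rows, the sketch below recomputes it from the values in Table 2. It assumes that "frequency-weighted" means the mean of the per-disease F1 scores weighted by the number of subjects in each class; the table does not state the formula, and the function name `frequency_weighted_f1` is hypothetical. Under that assumption, the result agrees with the reported row to three decimals for all four readers.

```python
# Subject counts per disease, in the row order of Table 2 (sums to 500).
counts = [100, 100, 80, 30, 30, 30, 30, 20, 30, 30, 20]

# Per-disease F1 scores copied from Table 2, same row order as `counts`.
f1_scores = {
    "AI model": [0.971, 0.914, 0.962, 0.877, 0.933, 0.947,
                 0.833, 0.857, 0.897, 0.983, 0.950],
    "Physician (3–5 years)": [0.957, 0.853, 0.916, 0.667, 0.578, 0.667,
                              0.615, 0.553, 0.451, 0.061, 0.519],
    "Physician (5–10 years)": [0.938, 0.911, 0.949, 0.778, 0.760, 0.931,
                               0.667, 0.600, 0.814, 0.929, 0.842],
    "Physician (>10 years)": [0.962, 0.940, 0.969, 0.885, 0.800, 0.931,
                              0.896, 0.683, 0.983, 0.931, 0.974],
}


def frequency_weighted_f1(f1, n):
    """Assumed definition: mean of per-class F1 weighted by class size."""
    return sum(score * size for score, size in zip(f1, n)) / sum(n)


for reader, scores in f1_scores.items():
    print(f"{reader}: {frequency_weighted_f1(scores, counts):.3f}")
# AI model: 0.931
# Physician (3–5 years): 0.734
# Physician (5–10 years): 0.872
# Physician (>10 years): 0.927
```

Because the weights are class sizes, the three largest classes (HCM, DCM and CAD, 280 of 500 subjects) dominate this summary, so the low per-disease scores of less experienced physicians on rarer diseases are partly masked in the weighted figure.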