Table 4 Inter-reader reliability of the models and radiologists
Inter-reader reliability between the ensemble model and radiologists

Fleiss κ (95% CI): 0.501 (0.463–0.538)

| Cohen κ (95% CI) | Expert 1 | CSTC | Expert 2 | CSTC | Expert 3 | CSTC |
|---|---|---|---|---|---|---|
| Ensemble model | 0.299 (0.265–0.333) | ++ | 0.493 (0.456–0.531) | +++ | 0.456 (0.419–0.493) | +++ |

| Cohen κ (95% CI) | Expert 4 | CSTC | Expert 5 | CSTC | Expert 6 | CSTC |
|---|---|---|---|---|---|---|
| Ensemble model | 0.356 (0.321–0.392) | +++ | 0.570 (0.532–0.607) | +++ | 0.596 (0.560–0.633) | +++ |
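The pairwise entries above are Cohen's κ with 95% CIs. A minimal sketch of how such a value and interval can be computed, assuming a percentile bootstrap over cases for the CI; the per-case reads below are hypothetical placeholders, not the study data:

```python
# Hedged sketch: pairwise Cohen's kappa with a bootstrap 95% CI.
# The label arrays are hypothetical; the paper does not publish per-case reads.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)

# Hypothetical three-class reads (e.g., 0 = benign tumor, 1 = malignant tumor,
# 2 = infection) for 100 radiographs by the ensemble model and one expert.
model_reads = rng.integers(0, 3, size=100)
expert_reads = np.where(rng.random(100) < 0.7,
                        model_reads,
                        rng.integers(0, 3, size=100))

kappa = cohen_kappa_score(model_reads, expert_reads)

# Percentile bootstrap over cases for the 95% CI.
n = len(model_reads)
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)
    boot.append(cohen_kappa_score(model_reads[idx], expert_reads[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"Cohen kappa = {kappa:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```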
Inter-reader reliability among radiologists

Fleiss κ (95% CI): 0.401 (0.364–0.438)

| Cohen κ (95% CI) | EG1 | CSTC | EG2 | CSTC | EG3 | CSTC |
|---|---|---|---|---|---|---|
|  | 0.267 (0.234–0.300) | ++ | 0.295 (0.261–0.329) | ++ | 0.581 (0.544–0.618) | +++ |
Inter-reader reliability among models

Fleiss κ (95% CI): 0.800 (0.770–0.830)

| Cohen κ (95% CI) | E3 | CSTC | E4 | CSTC | ViT | CSTC | SWIN | CSTC |
|---|---|---|---|---|---|---|---|---|
| Ensemble model | 0.805 (0.775–0.835) | +++++ | 0.793 (0.763–0.823) | ++++ | 0.783 (0.752–0.814) | ++++ | 0.908 (0.886–0.930) | +++++ |
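Each subtable also reports a single Fleiss' κ across all readers, and the CSTC columns grade each κ with "+" symbols. A hedged sketch of both, assuming the symbols follow Landis-Koch-style bins (an inference from the κ magnitudes in this table, not a method stated here); the reads are again hypothetical:

```python
# Hedged sketch: Fleiss' kappa across several readers, plus an assumed
# Landis-Koch-style mapping from kappa to the "+" strength symbols seen in
# the CSTC columns. Cutoffs and symbols are inferred, not taken from the paper.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(1)

# Hypothetical reads: 100 radiographs x 5 readers, three classes.
truth = rng.integers(0, 3, size=100)
reads = np.column_stack([
    np.where(rng.random(100) < 0.75, truth, rng.integers(0, 3, size=100))
    for _ in range(5)
])

counts, _ = aggregate_raters(reads)          # (n_cases, n_categories) counts
kappa = fleiss_kappa(counts, method="fleiss")

def strength_symbols(k):
    """Map kappa to agreement-strength symbols (assumed Landis-Koch bins)."""
    bins = [(0.20, "+"), (0.40, "++"), (0.60, "+++"),
            (0.80, "++++"), (1.01, "+++++")]
    return next(sym for upper, sym in bins if k < upper)

print(f"Fleiss kappa = {kappa:.3f}  strength: {strength_symbols(kappa)}")
```

Library routines (sklearn's `cohen_kappa_score`, statsmodels' `fleiss_kappa`) are used rather than hand-rolled formulas to keep the sketches short and verifiable.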