Table 2 Performance of individual raters. Deeplasia, Deeplasia-GE, and each of the seven individual raters in turn are compared against the consensus bone age established by the remaining six raters. Metrics where the automatic bone age assessment is more accurate than the manual assessment are marked in bold.
From: Population-specific calibration and validation of an open-source bone age AI
left-out rater | MAD manual | MAD Deeplasia | MAD Deeplasia-GE | RMSE manual | RMSE Deeplasia | RMSE Deeplasia-GE |
---|---|---|---|---|---|---|
I | 6.4 | 7.0 | **6.0** | 8.4 [7.8, 9.2] | 9.2 [8.5, 10.1] | **7.8** [7.2, 8.5] |
II | 7.3 | **7.0** | **6.1** | 9.5 [8.7, 10.4] | **9.2** [8.5, 10.1] | **7.9** [7.3, 8.6] |
III | 9.7 | **6.6** | **5.6** | 12.7 [11.7, 13.9] | **8.7** [8.0, 9.5] | **7.3** [6.7, 8.0] |
IV | 8.6 | **6.3** | **5.4** | 11.2 [10.3, 12.2] | **8.5** [7.8, 9.3] | **7.1** [6.5, 7.8] |
V | 6.7 | 6.8 | **5.8** | 8.9 [8.2, 9.7] | 9.0 [8.3, 9.8] | **7.4** [6.9, 8.1] |
VI | 7.4 | **6.5** | **5.8** | 9.8 [9.0, 10.7] | **8.6** [7.9, 9.4] | **7.4** [6.8, 8.1] |
VII | 9.3 | **7.4** | **6.6** | 12.0 [11.1, 13.1] | **9.8** [9.1, 10.8] | **8.4** [7.7, 9.2] |
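
The leave-one-rater-out comparison in the table can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. It assumes that the consensus is the per-image mean of the remaining raters and that MAD denotes the mean absolute difference; neither is stated in the table itself. The function names `mad_rmse` and `leave_one_rater_out` are hypothetical.

```python
import math
from statistics import mean


def mad_rmse(estimates, reference):
    """Return (MAD, RMSE) of estimates against reference values.

    MAD is taken here to be the mean absolute difference (an assumption).
    """
    errors = [e - r for e, r in zip(estimates, reference)]
    mad = mean(abs(d) for d in errors)
    rmse = math.sqrt(mean(d * d for d in errors))
    return mad, rmse


def leave_one_rater_out(ratings, model_preds):
    """Score each rater and the model against the consensus of the other raters.

    ratings: dict mapping rater name -> list of bone ages, one per image.
    model_preds: list of model bone ages, in the same image order.
    """
    results = {}
    for left_out in ratings:
        others = [r for r in ratings if r != left_out]
        # Assumed consensus: per-image mean over the remaining raters.
        consensus = [
            mean(ratings[r][i] for r in others)
            for i in range(len(model_preds))
        ]
        results[left_out] = {
            "manual": mad_rmse(ratings[left_out], consensus),
            "model": mad_rmse(model_preds, consensus),
        }
    return results
```

Because the model and the left-out rater are scored against the same consensus, the two values in each row are directly comparable, which is what the bold marking in the table relies on.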