Table 1 Performance of the uncalibrated base model (Deeplasia) and the Georgia-specific calibrated version (Deeplasia-GE) on the test set of the Georgian bone age dataset. Previously reported results on the RSNA, DHA, and GDBD datasets⁵ are provided for reference. DHA: Los Angeles Digital Hand Atlas; GDBD: German Dysplastic Bone Dataset; MAD: mean absolute difference; RMSE: root mean squared error; RSNA: Radiological Society of North America. Lower MAD and RMSE indicate higher accuracy. ᵇ Estimated range for the accuracies of the assessed single raters.
From: Population-specific calibration and validation of an open-source bone age AI
Dataset | No. Ref. Ratings | n | Deeplasia MAD (months) | Deeplasia RMSE (months) | Inter-rater MAD (months) | Inter-rater RMSE (months) |
---|---|---|---|---|---|---|
Georgian (base) | 7 | 260 | 6.6 | 8.8 ([8.1, 9.6]) | 7.9 | 10.6 |
Georgian (calibrated) | 7 | 260 | 5.7 | 7.4 ([6.8, 8.1]) | | |
RSNA¹¹ | 6 | 200 | 3.9 | 5.1 ([4.7, 5.7]) | 4.8–7.0ᵇ | - |
DHA²³ | 2 | 1383 | 5.8 | 7.7 ([7.4, 8.0]) | 4.4 | 7.0 |
GDBD⁵ | 2 | 702 | 6.0 | 7.7 ([7.3, 8.1]) | 9.5 | 12.8 |
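
The two error metrics reported in the table, MAD and RMSE, can be sketched as follows. This is a minimal illustration, assuming paired lists of predicted and reference bone ages in months. The function names and the sample values are hypothetical; they are not taken from the study or from the Deeplasia code base.

```python
import math


def mad(predicted, reference):
    """Mean absolute difference, in the unit of the inputs (months here)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def rmse(predicted, reference):
    """Root mean squared error, in the unit of the inputs (months here)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Hypothetical bone ages in months (illustrative only, not study data).
predicted = [120.0, 96.0, 150.0, 60.0]
reference = [126.0, 90.0, 147.0, 61.0]

print(f"MAD:  {mad(predicted, reference):.2f} months")   # MAD:  4.00 months
print(f"RMSE: {rmse(predicted, reference):.2f} months")  # RMSE: 4.53 months
```

Because RMSE squares each error before averaging, it weights large deviations more heavily and is never smaller than MAD on the same data, which is consistent with every row of the table.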