Table 4 Classification results (precision, recall, accuracy and F1 score) for the seven human raters vs. EfficientNet-B5, the best-performing AI variant for each metric. AI results were computed on the same subset of images that the humans rated, to allow a fair comparison. Cells are color-coded from white (0.5) to green (1.0). All group comparisons were statistically significant in paired-samples t-tests. * 95% confidence intervals are given for the human raters.
From: Predicting biological sex in pediatric skeleton X-rays using artificial intelligence
