Fig. 3: Comparison of the test accuracy distributions obtained with the four evaluated classifiers.

The mean (μ) and standard deviation (σ) of the accuracies are reported above each box. ResNet with transfer learning (ResNet-TL) achieves the highest mean accuracy (μ = 88.9%), followed by a ResNet trained from scratch (μ = 87.5%), the Z-Score baseline (μ = 83.7%), and the Random-Forest baseline (μ = 77.6%); the Random-Forest baseline also shows the largest variance.
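
For readers who want to reproduce this kind of summary, the per-classifier μ and σ can be computed from the accuracies of the individual runs. The following is a minimal sketch, not the evaluation code used for the figure: the function name `summarize_accuracies`, the classifier labels, and the accuracy values are hypothetical placeholders, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the caption does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_accuracies(runs):
    """Map each classifier name to (mean, sample std) of its per-run accuracies in percent."""
    return {name: (mean(acc), stdev(acc)) for name, acc in runs.items()}


# Placeholder accuracies for illustration only; these are NOT the results in Fig. 3.
example_runs = {
    "classifier_a": [90.0, 88.0, 89.0],
    "classifier_b": [80.0, 70.0, 75.0],
}

summary = summarize_accuracies(example_runs)

# Rank classifiers by mean accuracy, highest first, as in the caption.
ranked = sorted(summary, key=lambda name: summary[name][0], reverse=True)

for name in ranked:
    mu, sigma = summary[name]
    print(f"{name}: mu = {mu:.1f}%, sigma = {sigma:.1f}%")
```

If the population standard deviation is wanted instead, `statistics.pstdev` can be substituted for `stdev`.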