Table 3. Understanding how Fitzpatrick17k-C classification performance varies with changes in hyperparameters.

From: Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets

| Hyperparameters optimized on → | Verified | Random (Stratified) | Source A (Atl. Derm.) | Source B (DermaAmin) | FST 3–6 | FST 1–2 & 5–6 | FST 1–4 |
|---|---|---|---|---|---|---|---|
| Best Hyperparameters (n_epochs, optim, lr) | (200, Adam, 1e-3) | (200, Adam, 1e-4) | (100, Adam, 1e-4) | (200, SGD, 1e-2) | (100, Adam, 1e-4) | (200, Adam, 1e-4) | (200, Adam, 1e-4) |
| Holdout set performance (overall test accuracy) | | | | | | | |
| Verified | 4.65% ± 0.00% | 4.34% ± 0.22% | 4.50% ± 0.22% | 3.88% ± 0.96% | 4.50% ± 0.22% | 4.34% ± 0.22% | 4.34% ± 0.22% |
| Random Holdout | 20.20% ± 0.32% | 24.05% ± 0.34% | 23.23% ± 0.43% | 19.25% ± 0.57% | 23.23% ± 0.43% | 24.05% ± 0.34% | 24.05% ± 0.34% |
| Source A | 14.96% ± 0.77% | 16.38% ± 0.77% | 16.62% ± 0.86% | 14.28% ± 0.30% | 16.62% ± 0.86% | 16.38% ± 0.77% | 16.38% ± 0.77% |
| Source B | 4.78% ± 0.20% | 4.50% ± 0.32% | 4.50% ± 0.32% | 5.16% ± 0.20% | 4.50% ± 0.32% | 4.50% ± 0.32% | 4.50% ± 0.32% |
| FST 3–6 | 12.65% ± 0.23% | 14.36% ± 0.15% | 14.42% ± 0.13% | 11.80% ± 0.69% | 14.42% ± 0.13% | 14.36% ± 0.15% | 14.36% ± 0.15% |
| FST 1–2 & 5–6 | 14.56% ± 0.53% | 17.27% ± 0.38% | 16.99% ± 0.33% | 13.78% ± 0.41% | 16.99% ± 0.33% | 17.27% ± 0.38% | 17.27% ± 0.38% |
| FST 1–4 | 11.34% ± 0.66% | 12.07% ± 0.28% | 11.92% ± 0.38% | 8.08% ± 4.58% | 11.92% ± 0.38% | 12.07% ± 0.28% | 12.07% ± 0.28% |

  1. Each column corresponds to the optimal hyperparameters found for one of the seven experimental settings, and each row reports the overall test accuracy on one of the seven holdout settings when models trained with those hyperparameters are evaluated. For example, when a model trained with the optimal hyperparameters for “Source A” (i.e., 100 training epochs, Adam optimizer, learning rate 1e-4) is used to generate predictions for the “Verified” setting, the overall test accuracy is 4.50% ± 0.22%.
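The row-and-column reading described in the footnote can be sketched in a few lines of Python. This is only an illustrative sketch: the values are transcribed from Table 3, while the names `COLUMNS`, `ACCURACY`, `lookup`, and `best_columns` are hypothetical helpers introduced here and are not part of the original work.

```python
# Columns of Table 3: the setting on which the hyperparameters were optimized.
COLUMNS = [
    "Verified", "Random (Stratified)", "Source A (Atl. Derm.)",
    "Source B (DermaAmin)", "FST 3–6", "FST 1–2 & 5–6", "FST 1–4",
]

# Rows of Table 3: holdout set -> (mean accuracy %, std %) per column,
# listed in COLUMNS order. Values transcribed from the table.
ACCURACY = {
    "Verified": [(4.65, 0.00), (4.34, 0.22), (4.50, 0.22), (3.88, 0.96),
                 (4.50, 0.22), (4.34, 0.22), (4.34, 0.22)],
    "Random Holdout": [(20.20, 0.32), (24.05, 0.34), (23.23, 0.43), (19.25, 0.57),
                       (23.23, 0.43), (24.05, 0.34), (24.05, 0.34)],
    "Source A": [(14.96, 0.77), (16.38, 0.77), (16.62, 0.86), (14.28, 0.30),
                 (16.62, 0.86), (16.38, 0.77), (16.38, 0.77)],
    "Source B": [(4.78, 0.20), (4.50, 0.32), (4.50, 0.32), (5.16, 0.20),
                 (4.50, 0.32), (4.50, 0.32), (4.50, 0.32)],
    "FST 3–6": [(12.65, 0.23), (14.36, 0.15), (14.42, 0.13), (11.80, 0.69),
                (14.42, 0.13), (14.36, 0.15), (14.36, 0.15)],
    "FST 1–2 & 5–6": [(14.56, 0.53), (17.27, 0.38), (16.99, 0.33), (13.78, 0.41),
                      (16.99, 0.33), (17.27, 0.38), (17.27, 0.38)],
    "FST 1–4": [(11.34, 0.66), (12.07, 0.28), (11.92, 0.38), (8.08, 4.58),
                (11.92, 0.38), (12.07, 0.28), (12.07, 0.28)],
}


def lookup(holdout, optimized_on):
    """Return (mean, std) for a holdout row and a hyperparameter column."""
    return ACCURACY[holdout][COLUMNS.index(optimized_on)]


def best_columns(holdout):
    """Return every column whose mean accuracy ties for the highest in this row."""
    top = max(mean for mean, _ in ACCURACY[holdout])
    return [col for col, (mean, _) in zip(COLUMNS, ACCURACY[holdout]) if mean == top]


# The footnote's example: Source A's hyperparameters evaluated on "Verified".
print(lookup("Verified", "Source A (Atl. Derm.)"))
# Which hyperparameter columns give the highest mean accuracy on "Source B"?
print(best_columns("Source B"))
```

Columns that share the same best hyperparameters, for example Random (Stratified), FST 1–2 & 5–6, and FST 1–4 with (200, Adam, 1e-4), report identical accuracies in every row, so `best_columns` can return several tied entries.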