Table 4 Summary statistics for the two datasets analyzed in this paper and the corrected versions proposed for each.

Dataset	Brief Description	#Images	#Diagnoses	#Training Images	#Validation Images	#Testing Images
DermaMNIST	The original DermaMNIST dataset.	10,015	7	7,007	1,003	2,005
DermaMNIST-C	The “corrected” version of DermaMNIST, without any data leakage.	10,015	7	8,215	573	1,227
DermaMNIST-E	The “extended” version of DermaMNIST, without any data leakage and with more images.	11,719	7	10,015	193	1,511
Fitzpatrick17k	The original Fitzpatrick17k dataset.	16,577	114	12,751	3,826	0
Fitzpatrick17k-C	The “cleaned” version of Fitzpatrick17k, with standardized train-valid-test splits after removing duplicates and erroneous images.	11,394	114	7,975	1,139	2,280

For both Fitzpatrick17k and Fitzpatrick17k-C, the partitions correspond to the experiment titled “Random”, both in Table 2 and in Groh et al.²⁰.
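The partition sizes in the table can be checked for internal consistency, and the implied split ratios compared across versions, with a short script. The following is a minimal sketch, not part of the original analysis: the counts are transcribed by hand from the table, and the names `SPLITS` and `split_fractions` are hypothetical helpers introduced only for illustration.

```python
# Counts transcribed from Table 4 as (total, train, validation, test).
# SPLITS and split_fractions are illustrative names, not from the paper.
SPLITS = {
    "DermaMNIST": (10015, 7007, 1003, 2005),
    "DermaMNIST-C": (10015, 8215, 573, 1227),
    "DermaMNIST-E": (11719, 10015, 193, 1511),
    "Fitzpatrick17k": (16577, 12751, 3826, 0),
    "Fitzpatrick17k-C": (11394, 7975, 1139, 2280),
}


def split_fractions(name):
    """Return (train, validation, test) fractions, rounded to 3 decimals.

    Raises ValueError if the three partitions do not add up to the
    reported total number of images.
    """
    total, train, valid, test = SPLITS[name]
    if train + valid + test != total:
        raise ValueError(f"{name}: partitions do not sum to {total}")
    return tuple(round(n / total, 3) for n in (train, valid, test))


if __name__ == "__main__":
    for name in SPLITS:
        print(name, split_fractions(name))
```

Under these transcribed counts, every row's partitions sum to its reported total. The original DermaMNIST and Fitzpatrick17k-C both come out at roughly 70/10/20 for train/validation/test, whereas the original Fitzpatrick17k "Random" partition has no test set.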
