Table 4. Summary statistics for the two datasets analyzed in this paper and the corrected versions proposed for each.
From: Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets
Dataset | Brief Description | #Images | #Diagnoses | #Training Images | #Validation Images | #Testing Images |
---|---|---|---|---|---|---|
DermaMNIST | The original DermaMNIST dataset. | 10,015 | 7 | 7,007 | 1,003 | 2,005 |
DermaMNIST-C | The “corrected” version of DermaMNIST, without any data leakage. | 10,015 | 7 | 8,215 | 573 | 1,227 |
DermaMNIST-E | The “extended” version of DermaMNIST, without any data leakage and with more images. | 11,719 | 7 | 10,015 | 193 | 1,511 |
Fitzpatrick17k | The original Fitzpatrick17k dataset. | 16,577 | 114 | 12,751 | 3,826 | 0 |
Fitzpatrick17k-C | The “cleaned” version of Fitzpatrick17k, with standardized train-valid-test splits after removing duplicates and erroneous images. | 11,394 | 114 | 7,975 | 1,139 | 2,280 |
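As a quick sanity check on the table, the following minimal sketch confirms that each row's training, validation, and testing counts sum to its total image count, and converts the counts to split fractions. The counts are copied from the table above; the names `SPLITS` and `split_fractions` are illustrative only and are not part of either dataset's tooling.

```python
# Split sizes copied from Table 4: (total, train, validation, test).
SPLITS = {
    "DermaMNIST": (10015, 7007, 1003, 2005),
    "DermaMNIST-C": (10015, 8215, 573, 1227),
    "DermaMNIST-E": (11719, 10015, 193, 1511),
    "Fitzpatrick17k": (16577, 12751, 3826, 0),
    "Fitzpatrick17k-C": (11394, 7975, 1139, 2280),
}


def split_fractions(total, train, valid, test):
    """Return (train, valid, test) fractions of the total, rounded to 3 places.

    Raises ValueError if the three partitions do not sum to the total.
    """
    if train + valid + test != total:
        raise ValueError("split sizes do not sum to the total")
    return tuple(round(n / total, 3) for n in (train, valid, test))


if __name__ == "__main__":
    for name, row in SPLITS.items():
        print(name, split_fractions(*row))
```

Every row passes the sum check. The original DermaMNIST and Fitzpatrick17k-C both come out at roughly a 70/10/20 train/validation/test split, while the original Fitzpatrick17k has no test partition.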