Table 2. Dataset details.

From: A hybrid quantum–classical convolutional neural network with a quantum attention mechanism for skin cancer

| Dataset | Full name & URL | Image type & resolution | Total samples used | Classes (count) | Train/test split | Class distribution & imbalance handling |
| --- | --- | --- | --- | --- | --- | --- |
| MNIST | MNIST Handwritten Digit Database, http://yann.lecun.com/exdb/mnist/ | Grayscale, $28\times 28$ | 70,000 (60,000 train; 10,000 test) | 10 digits (0–9) | 80/20 | Balanced across 10 classes; no special handling required |
| CIFAR-10 | CIFAR-10 Object Recognition Dataset, https://www.cs.toronto.edu/~kriz/cifar.html | RGB, $32\times 32$ | 60,000 (50,000 train; 10,000 test) | 10 object categories (aeroplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck) | 80/20 | Balanced across 10 classes; no special handling required |
| Skin Cancer (Binary) | Skin Cancer: Malignant vs. Benign (Kaggle), https://www.kaggle.com/datasets/fanconic/skin-cancer-malignant-vs-benign | RGB dermoscopy, up to $1024\times 1024$, resized to $150\times 150$ | 3297 (1800 benign; 1497 malignant) | 2 classes: Benign (1800), Malignant (1497) | 80/20 | Imbalanced (malignant $\approx 45\%$); addressed via weighted cross-entropy loss (higher weight for the malignant class) and augmentation (rotation, flips, zoom) to oversample malignant cases |
| HAM10000 | HAM10000 (“Human Against Machine with 10,000 training images”), https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T | RGB dermatoscopic images, typically $600\times 450$ pixels | 10,015 images | 7 classes of pigmented lesions: Melanocytic nevi (NV), Melanoma (MEL), Benign keratosis-like lesions (BKL), Basal cell carcinoma (BCC), Actinic keratoses (AKIEC), Vascular lesions (VASC), Dermatofibroma (DF) | No official split; commonly used splits vary (e.g., 80/20 or 90/10 for train/test) | Highly imbalanced: melanocytic nevi (NV) comprise $>60\%$ of samples; melanoma (MEL) is $\approx 10\%$. Handled in studies using focal loss, oversampling, and class weighting |
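The table states that the binary skin cancer dataset is handled with a weighted cross-entropy loss that gives the malignant class a higher weight, but it does not report the weights themselves. The sketch below shows one common way such weights could be derived from the class counts in the table. The inverse-frequency scheme and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made for illustration and are not taken from the paper.

```python
import math

# Class counts from the "Skin Cancer (Binary)" row of the table.
CLASS_COUNTS = {"benign": 1800, "malignant": 1497}


def inverse_frequency_weights(counts):
    """Return n_total / (n_classes * n_c) per class, so rarer classes weigh more.

    Assumed weighting scheme; the paper does not state which one it uses.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over a batch.

    probs   : list of dicts mapping class name -> predicted probability
    labels  : list of true class names, one per sample
    weights : dict mapping class name -> class weight
    """
    weighted_losses = [weights[y] * -math.log(p[y]) for p, y in zip(probs, labels)]
    # Normalise by the total weight of the batch rather than by the batch size.
    return sum(weighted_losses) / sum(weights[y] for y in labels)


weights = inverse_frequency_weights(CLASS_COUNTS)

# Toy batch: a confident correct benign prediction and a missed malignant case.
batch_probs = [
    {"benign": 0.9, "malignant": 0.1},
    {"benign": 0.7, "malignant": 0.3},
]
batch_labels = ["benign", "malignant"]
loss = weighted_cross_entropy(batch_probs, batch_labels, weights)
print(weights, loss)
```

With these counts the malignant weight comes out slightly above 1 and the benign weight slightly below 1, reflecting the mild 45/55 imbalance. The same heuristic would produce much larger weight ratios for HAM10000, where a single class accounts for more than 60% of samples.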