Table 17 Comparative analysis against relevant studies on ISIC 2018 and ISIC 2019 datasets.
From: Multimodal deep learning ensemble framework for skin cancer detection
| Study | Dataset | Pre-processing | Use metadata | Classifier and training algorithm | Parameters | Accuracy (%) | Proposed model (%) |
|---|---|---|---|---|---|---|---|
| | ISIC 2018 | Data augmentation, colour constancy (Shades of Gray), metadata encoding | Yes | EfficientNet-B3 and B4 with metadata fusion, TTA | SGD, OneCycle LR, weighted cross-entropy loss | 89.5 | 93.2 |
| | | Image: data augmentation. Metadata: handling missing values, normalization of numerical attributes (age), balancing the metadata distribution | Yes | Hybrid CNN-ViT with focal loss (FL) | Adam optimizer, learning rate = 0.001, batch size = 32, focal loss for class imbalance | 89.48 | |
| | | Contrast enhancement | No | DarkNet-53 and DenseNet-201 using transfer learning | Epochs = 100, learning rate = 0.0002, momentum = 0.6557, batch size = 128 | 85.4 | |
| | | Data augmentation and resizing | No | Inception-ResNet-v2 and EfficientNet-B4 ensemble | Adam optimizer with lr = 0.01, epsilon = 0.1 | 88.21 | |
| | | Resizing, data augmentation and normalization | No | ConvNeXt-Tiny, EfficientNet-B0, SENet, DenseNet, ResNet-50 ensemble | Adam optimizer with lr = 0.001, epochs = 100, batch size = 32 | 90.15 | |
| | | Image resizing | No | DXDSENet-CM ensemble combining Xception, DenseNet-201 and a Depthwise Squeeze-and-Excitation ConvMixer for skin lesion classification | Adam optimizer with ReduceLROnPlateau schedule (factor 0.3, min_lr 1e-6), batch size = 128, epochs = 100, input size = 224 × 224, activations: GELU, ReLU, Softmax | 88.21 | |
| | | Image resizing, normalization | No | Federated MobileNetV2 | 4 clients in total (2 trained on ISIC 2018); 7 classes | 80 | |
| | ISIC 2019 | Image: data augmentation, resizing. Metadata: standardization; missing metadata handled by mean imputation for numerical values and mode imputation for categorical values | Yes | Ensemble of EfficientNet-B0, B1 and B2 for the image path; metadata processed through a separate path and fused with image features for final classification | Adam optimizer with lr = 0.001, batch size = 32 | 74.2 (balanced accuracy) | 91.1 |
| | | Data augmentation, colour constancy (Shades of Gray), metadata encoding | Yes | EfficientNet-B3 and B4 with metadata fusion, TTA | SGD, OneCycle LR, weighted cross-entropy loss | 66.2 | |
| | | Black-border removal and real-time data augmentation | No | Ensemble of EfficientNet-B5, SE-ResNeXt-101 (32 × 4d), EfficientNet-B4 and Inception-ResNet-v2 | Over 32 epochs, weighted cross-entropy loss | 63.4 | |
| | | Normalization, data augmentation and cropping | No | Ensemble of DenseNet-V2, Inception-V3, Inception-ResNet-V2 and Xception | Adam optimizer, initial learning rate = 1e-3, then 1e-4 (from the 4th epoch), epochs = 50, batch size = 64 | 82.1 | |
| | | Metadata: clinical data (age, sex, medical history) integrated | Yes | DenseNet-169 with MetaNet and MetaBlock modules | Adam optimizer, learning rate = 0.001, batch size = 32 | 81.4 (balanced accuracy) | |
| | | Image resizing (100 × 100), Non-Local Means denoising, data augmentation | No | Custom CNN with sparse dictionary learning | Adam optimizer, batch size = 32, epochs = 100, filters (128, 256, 512, 512, 256), kernel sizes (11 × 11 → 1 × 1), ReLU and Softmax activations | 81.23 | |
| | | Image resizing (299 × 299), data augmentation, class balancing through oversampling | No | Inception-V3 | Adam optimizer, learning rate = 0.01, dropout = 0.25, batch size = 20, epochs = 50, fivefold cross-validation | 88.63 | |
| | | Image resizing, normalization | No | Federated MobileNetV2 | 4 clients in total (2 trained on ISIC 2019); 8 classes | 87 | |
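Several of the compared approaches counter class imbalance with focal or weighted cross-entropy losses. A minimal NumPy sketch of the standard focal loss, FL(p_t) = −α(1 − p_t)^γ log(p_t), is shown below; the function name and the default α = 0.25, γ = 2.0 are illustrative assumptions, not values taken from any of the cited studies:

```python
import numpy as np

def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Mean focal loss over a batch.

    probs:   (N, C) array of softmax probabilities
    targets: (N,) array of integer class labels
    """
    # p_t is the predicted probability of each sample's true class
    p_t = probs[np.arange(len(targets)), targets]
    # (1 - p_t)^gamma down-weights well-classified samples,
    # focusing the loss on hard (minority-class) examples
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))
```

With γ = 0 and α = 1 the expression reduces to plain cross-entropy, which is why focal loss is often described as a generalization of it for imbalanced datasets such as ISIC 2018/2019.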