Boosting skin cancer diagnosis accuracy with ensemble approach

Natha, Priya; Tera, Sivarama Prasad; Chinthaginjala, Ravikumar; Rab, Safia Obaidur; Narasimhulu, C. Venkata; Kim, Tae Hoon

doi:10.1038/s41598-024-84864-5


Article
Open access
Published: 08 January 2025


Scientific Reports volume 15, Article number: 1290 (2025)


Abstract

Skin cancer is common and can be deadly, so a correct diagnosis at an early stage is essential. Effective therapy depends on precise classification of the several forms of skin cancer, each of which has distinct traits. Dermoscopy and other sophisticated imaging methods produce detailed lesion images and have improved early detection. Analyzing these images to differentiate benign from malignant tumors nevertheless remains difficult. Better predictive modeling methods are needed because current diagnostic procedures frequently produce inaccurate and inconsistent results. In dermatology, machine learning (ML) models are becoming essential for the automatic detection and classification of skin cancer lesions from image data. This work seeks to improve skin cancer prediction with an ensemble model, which mixes several ML approaches to exploit their advantages and lessen their disadvantages. We introduce a new method, the Max Voting method, for optimizing skin cancer classification. On the HAM10000 and ISIC 2018 datasets, we trained and assessed three distinct ML models: Random Forest (RF), Multi-layer Perceptron Neural Network (MLPN), and Support Vector Machine (SVM). Combining their predictions with the Max Voting technique increased overall performance. Moreover, the feature vectors given to the ML models were optimized from image data by a Genetic Algorithm (GA). We demonstrate that the Max Voting method greatly improves predictive performance, reaching an accuracy of 94.70% and producing the best results for F1-measure, recall, and precision. Max Voting turned out to be the most dependable and robust approach, combining the benefits of several pre-trained ML models to provide a new and efficient method for classifying skin cancer lesions.


Introduction

Skin cancer, which is prevalent globally and occasionally life-threatening^1,2, poses significant challenges for healthcare providers. Effective patient treatment requires precise diagnosis and classification of various types of skin cancer. Selecting an appropriate treatment plan is crucial, as each type of skin cancer possesses distinct characteristics that complicate predictive modeling. Among the types, melanoma^3,4,5 is particularly concerning due to its tendency to metastasize more readily. Advanced imaging techniques like dermoscopy⁶ have notably enhanced the early diagnosis of skin cancers. These high-resolution imaging methods enable the detection of specific features in skin lesions that may indicate malignancy. However, the main challenge remains accurately analyzing these images to identify the specific type of skin cancer and differentiate between benign and malignant cells. The complexity of skin cancer diagnosis is compounded by the diversity of skin conditions and the limitations of current diagnostic techniques, which often lead to inconsistent and inaccurate diagnoses.

Physical examinations and biopsies, while standard, are time-consuming, invasive, and often subject to human error due to variability in clinical expertise and interpretation. Visual inspections of skin lesions can lead to inconsistent diagnoses, as subtle differences between benign and malignant lesions may not be easily discernible. Moreover, even with advanced imaging techniques like dermoscopy, the analysis of high-resolution images relies heavily on subjective interpretation. These limitations underscore the necessity for more accurate, non-invasive diagnostic tools that can enhance early detection and reduce the likelihood of misdiagnosis, ultimately improving patient outcomes.

Recently, numerous machine learning methods have been explored for applications across various fields^7,8,9,10,11. Machine learning (ML) has the potential to revolutionize dermatology by addressing many of the challenges faced in traditional skin cancer diagnosis^12,13,14,15,16. ML models can analyze large datasets of dermoscopic images more rapidly and accurately than human clinicians, providing consistent, objective assessments free from human bias. Furthermore, ML models are capable of identifying subtle patterns and features within images that may be undetectable to the human eye, offering the potential for earlier detection of malignancies like melanoma, where early diagnosis significantly improves survival rates. Additionally, ML-based tools can be deployed in remote areas or regions with limited access to specialized dermatologists, enhancing the reach of expert diagnostic capabilities through telemedicine. As these models continue to improve with more extensive training data, they will provide robust decision-support systems that can assist dermatologists in formulating personalized treatment plans, improving patient outcomes, and streamlining clinical workflows.

While existing ML models, such as SVMs, Convolutional Neural Networks (CNNs), and RFs, have shown promise, they are often limited by their reliance on specific datasets and their inability to generalize well across different skin lesion types. Moreover, many of these models are confined to binary classification tasks, focusing solely on distinguishing between melanoma and non-melanoma lesions. Our proposed approach introduces a novel Max Voting ensemble method that combines the strengths of multiple pre-trained models, including RF, MLPN, and SVM, to achieve multi-class classification. Additionally, the integration of a Genetic Algorithm (GA) for optimized feature extraction further enhances the model’s performance by selecting the most relevant image features. This dual approach of ensemble learning with optimal feature extraction significantly improves predictive accuracy, robustness, and generalizability across different datasets.

Several concerns motivate this research. First, current diagnostic techniques show a great deal of variability, since subjectivity and human error sometimes lead to incorrect conclusions. This unpredictability can result in wrong diagnoses and unsuitable courses of therapy. Second, many current ML models are limited to binary classification, which makes them less useful in real situations where several kinds of skin abnormalities must be identified. Finally, heavy dependence on particular datasets for both training and testing can lead to overfitting. This dependence limits a model's usefulness in broader clinical applications by reducing its capacity to generalize to new and varied data.

This study is grounded in the following hypotheses. First, we anticipate that ensemble learning models will outperform individual machine learning models in terms of prediction resilience and accuracy. Second, we expect that implementing a standardized assessment system will enhance the consistency and comparability of results across studies and datasets. Finally, we hypothesize that expanding the range of classified skin lesion types will increase the applicability of the machine learning model in clinical settings.

This study primarily seeks to establish a reliable and precise ML framework aimed at classifying multiple types of skin cancer from dermoscopic images. By leveraging an ensemble learning approach, combined with a Genetic Algorithm for feature optimization, this study aims to significantly improve diagnostic accuracy, reduce variability in diagnoses, and ultimately support early and reliable detection of various skin cancer types, thus improving patient outcomes.

Our contributions include:

This work investigates a novel use of the Max Voting approach to enhance the precision and dependability of skin cancer lesion classification.
By exploiting the advantages of several pre-trained ML models (RF, MLPN, and SVM), the proposed approach shows improved accuracy and resilience.
A Genetic Algorithm (GA) was used to produce optimal feature vectors from a collection of images; these vectors were then passed to the ML classification methods.
We combined modern ML techniques with the field of healthcare to create a novel and effective approach to categorizing skin cancer lesions.

This work is arranged as follows: Section “Literature review” reviews the literature. Section “Proposed method” describes the proposed method in detail. Section “Results and its discussion” presents the simulation results and an extensive discussion. Section “Conclusion” closes with a summary and concluding remarks.

Literature review

ML has proven to be instrumental in automating skin cancer diagnosis, offering substantial improvements over traditional diagnostic approaches. This review explores key machine learning approaches applied in skin cancer diagnosis and their current limitations, and establishes the need for a novel ensemble approach combining Genetic Algorithm (GA)-optimized feature selection with Max Voting.

Machine learning approaches for skin cancer diagnosis

Support vector machines (SVMs)

SVMs are known for constructing optimal hyperplanes that maximize the margin between classes. SVMs are particularly effective in handling small datasets and generalizing well, making them suitable for medical image analysis where labeled data is often limited. Studies have shown that SVMs achieve strong performance in binary classification tasks for skin lesions, distinguishing malignant melanoma from benign lesions with high accuracy^17,18.
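For reference, the margin-maximization idea described above is usually written as the soft-margin SVM optimization problem below. The notation is the standard textbook one and is an assumption here, not taken from the paper: training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, slack variables $\xi_i$, and a regularization constant $C$.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad & y_i\,\bigl(w^{\top}x_i + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0,\quad i = 1,\dots,n .
\end{aligned}
```

The geometric margin between the two classes is $2/\lVert w\rVert$, so minimizing $\lVert w\rVert^{2}$ maximizes the margin, while $C$ controls how strongly margin violations are penalized.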

Esteva et al. (2017) demonstrated that SVMs can perform comparably to deep learning models when trained on carefully curated features, making them effective in resource-constrained environments¹⁷. Arora et al. (2022) present a method for skin cancer diagnosis that combines the Bag of Features (BoF) model with an SVM classifier. The BoF model extracts distinctive visual features from skin lesion images, such as texture and color patterns, which are then classified using an SVM, a powerful algorithm for binary classification¹⁸. Such results emphasize that, despite the rise of more complex models, SVMs remain highly relevant for applications where simpler models with fewer data requirements are preferable.

Kalpana et al. (2023) contribute a promising approach for early skin disease diagnosis, emphasizing efficiency, accuracy, and adaptability. By combining SVM and RF in an optimized ensemble, their OESV-KRF model points to a pathway for developing diagnostic tools that are both powerful and computationally manageable, making it suitable for real-time and resource-limited applications¹⁹.

However, SVMs face certain challenges when dealing with large, imbalanced datasets. Specifically, the prevalence of benign cases often overshadows malignant samples, leading to bias in the classification model. To counter this, methods such as Synthetic Minority Over-sampling Technique (SMOTE) have been used alongside SVMs to balance class distributions, improving the detection sensitivity for malignant lesions²⁰. This integration ensures that SVMs can effectively address the imbalance issues. This study incorporates SVMs within an ensemble framework, combining them with RF and MLPN to enhance robustness and accuracy in skin cancer classification.
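The oversampling idea behind SMOTE can be illustrated with a small standard-library sketch: each synthetic minority sample is placed at a random position on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote_like_oversample`, the toy feature vectors, and the choice of `k` are hypothetical and only illustrate the principle; this is not the implementation used in the cited work.

```python
import math
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy 2-D feature vectors standing in for malignant (minority) samples.
malignant = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(malignant, n_new=6)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class.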

Random forest (RF)

Random Forest is a commonly used ensemble learning technique that integrates multiple decision trees to enhance classification performance and ensure model robustness. A key benefit of RF in medical image analysis is its capacity to process high-dimensional datasets effectively and reduce overfitting by aggregating several weaker models (decision trees). In the context of skin cancer diagnosis, RF has demonstrated high efficacy in distinguishing between benign and malignant lesions. Mahbod et al. (2019) employed RF in combination with other models to classify melanoma and non-melanoma skin lesions, achieving accuracy comparable to that of deep learning models but with reduced computational requirements. RF is especially suitable for datasets where features are highly varied, as it can assess feature importance and optimize decision-making accordingly. Studies have shown that RF performs reliably even with relatively small datasets, making it a practical choice in scenarios where large labeled datasets are scarce²¹.

However, RF models are known to struggle in multi-class classification settings, particularly when classes are imbalanced. As a result, RF is often integrated into ensemble frameworks with other classifiers, such as SVMs or CNNs, to enhance its predictive power in complex classification tasks²². By incorporating RF within an ensemble that leverages a GA for feature selection, this study aims to capitalize on RF’s strengths while addressing its limitations in a multi-class skin lesion classification context.

Multi-layer perceptron neural network (MLPN)

An MLPN is a type of feed-forward neural architecture composed of several neuron layers, including input, hidden, and output layers. MLPNs excel at capturing complex, non-linear interactions among features, which makes them particularly useful for medical image analysis where patterns are often intricate²³. In skin cancer diagnosis, MLPN has been used both as a standalone classifier and within ensemble frameworks. However, MLPNs often require extensive parameter tuning and sufficient data to achieve optimal performance, which can limit their effectiveness on smaller datasets²⁴. The limitation of MLPNs in handling high-dimensional, imbalanced datasets has led researchers to explore combining MLPN with other ML methods, such as SVM or decision trees, to improve classification accuracy and robustness.
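The layered structure described above can be sketched as a single forward pass through one hidden layer. The layer sizes, the hand-picked weights, and the ReLU and softmax activations are assumptions made for illustration; the section does not specify the architecture used in the study.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(features, hidden_w, hidden_b, out_w, out_b):
    hidden = relu(dense(features, hidden_w, hidden_b))  # input -> hidden
    return softmax(dense(hidden, out_w, out_b))         # hidden -> class probabilities

# Hypothetical weights: 3 input features, 2 hidden neurons, 2 output classes.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [-1.0, 1.0]]
out_b = [0.0, 0.0]

probs = mlp_forward([1.0, 2.0, 3.0], hidden_w, hidden_b, out_w, out_b)
predicted_class = probs.index(max(probs))
```

In a trained network the weights would be learned by backpropagation rather than set by hand; the non-linearity in the hidden layer is what lets the model capture interactions among features.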

Additionally, MLPNs are prone to overfitting, especially in cases with limited training data. To address this, ensemble methods incorporating MLPN have shown effectiveness in increasing generalizability, particularly when paired with complementary classifiers like RF and SVM. In ensemble methods, such as the one proposed in this study, MLPN can contribute its ability to capture complex feature interactions while being supported by other models like RF and SVM to provide a more generalized solution. By integrating MLPN within an ensemble framework optimized by a GA for feature selection, this study leverages MLPN’s non-linear learning capabilities to enhance overall performance in skin cancer classification.

Ensemble learning in skin cancer classification

Ensemble learning has become a widely adopted approach in skin cancer classification due to its ability to combine the strengths of multiple models, resulting in improved accuracy and robustness. Ensemble methods such as bagging, boosting, stacking, and Max Voting are commonly applied in skin cancer diagnosis, each offering distinct benefits for handling complex image data.

Bagging and random forests. Bagging involves training multiple models on different subsets of the data and averaging their predictions. RF, a form of bagging with decision trees as base learners, has been widely used in skin cancer classification. Goyal et al. (2019) demonstrated that RF-based ensemble learning achieved high classification accuracy on melanoma datasets, benefiting from the reduced variance inherent in bagging²⁵. Additionally, Dhivyaa et al. (2020) found that RF models could effectively differentiate between melanoma and benign lesions by capturing complex patterns in dermoscopic images²².

Boosting techniques. Boosting combines weak learners sequentially, with each new model focusing on correcting errors made by previous models. Popular boosting algorithms such as AdaBoost and Gradient Boosting have shown promising results in skin cancer classification. For example, Gamil et al. (2024) used AdaBoost to classify dermoscopic images, reporting improvements in sensitivity and specificity by iteratively focusing on difficult cases²⁶. Chang et al. (2022) applied Gradient Boosting in combination with deep features to detect melanoma, achieving high accuracy and specificity, which underscores the potential of boosting methods to enhance classification in medical imaging²⁷. Boosting techniques are particularly effective when dealing with imbalanced datasets, which are common in medical applications where malignant cases are often less frequent.
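To make "focusing on difficult cases" concrete, the sketch below performs one round of the textbook discrete AdaBoost update for binary labels in {-1, +1}: misclassified samples gain weight, and correctly classified ones lose weight. The function name and toy labels are hypothetical, and this is not the implementation used in the cited studies.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: return the weak learner's vote weight (alpha)
    and the re-normalised sample weights. Assumes 0 < weighted error < 1."""
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # t * p is +1 for a correct prediction and -1 for a mistake.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]   # the weak learner misclassifies sample 1
weights = [0.2] * 5           # uniform starting weights
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.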

Stacking ensembles. Stacking, or stacked generalization, involves training multiple classifiers and then combining their outputs with a meta-classifier, which learns to make predictions based on the strengths of each base classifier. This technique has been explored in skin cancer diagnosis due to its capacity to leverage multiple model architectures effectively. Shorfuzzaman et al. (2022) developed a stacking ensemble combining CNN models, achieving improved classification performance on skin cancer images²⁸. By incorporating both deep learning and traditional machine learning models, stacking ensembles can capture both high-level patterns and precise boundaries, offering enhanced classification accuracy.

Max Voting and hybrid ensembles. Max Voting is one of the simplest ensemble methods: each classifier “votes” on the predicted class, and the class with the majority of votes is selected. Although straightforward, Max Voting has proven effective in various skin cancer classification studies. Bhowmik et al. (2019) implemented a Max Voting ensemble combining the IG-ResNeXt-101, SWSL-ResNeXt-101, ECA-ResNet-101, and DPN-131 models, achieving higher accuracy in melanoma detection than any individual model²⁹. Max Voting is often applied in hybrid ensembles, which combine multiple machine learning techniques to improve classification accuracy.
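A minimal sketch of the voting rule, assuming each base classifier has already produced one label per image. The label strings and the three prediction lists are illustrative, and the tie-breaking rule (the first label encountered wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter

def max_vote(votes):
    """Return the most frequent label among the classifiers' votes.
    On a tie, the label encountered first (the earliest model) wins."""
    return Counter(votes).most_common(1)[0][0]

def max_vote_batch(per_model_predictions):
    """Combine per-model prediction lists sample by sample."""
    return [max_vote(votes) for votes in zip(*per_model_predictions)]

# Hypothetical predictions from three classifiers for four images.
rf_pred = ["mel", "nv", "bkl", "nv"]
svm_pred = ["mel", "bcc", "bkl", "mel"]
mlp_pred = ["nv", "nv", "bkl", "df"]

ensemble = max_vote_batch([rf_pred, svm_pred, mlp_pred])
# ensemble == ["mel", "nv", "bkl", "nv"]; the last image is a three-way tie,
# resolved in favour of the first model listed.
```

With three voters and more than two classes, three-way ties are possible, so the order of the models or an explicit tie-breaking rule affects the result.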

Genetic Algorithm

Feature selection is a critical step, especially when working with high-dimensional image data, as it helps reduce overfitting, enhances model generalization, and improves computational efficiency^30,31. In this study, we employ a GA to identify the most relevant subset of features from the dermoscopic images, ensuring optimal model performance. The GA is a bio-inspired optimization technique based on the principles of natural selection and genetics, which efficiently searches the feature space for the best combination of features. The GA operates through a series of evolutionary steps, which involve initializing a population of potential feature subsets (chromosomes), evolving these subsets through selection, crossover, and mutation, and iteratively refining the population until an optimal solution is found.
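The evolutionary steps listed above (initialization, selection, crossover, mutation, iteration) can be sketched as follows. Each chromosome is a bit mask over the features. The tournament selection, single-point crossover, all hyperparameter values, and the toy fitness function are assumptions chosen for illustration; in practice the fitness would typically be a validation score of a classifier trained on the selected features.

```python
import random

def run_ga(n_features, fitness, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve bit masks (1 = keep feature) and return the fittest one seen."""
    rng = random.Random(seed)
    # Initial population of random feature subsets (chromosomes).
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament(pop):
        # Selection: the fitter of two randomly drawn chromosomes.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)  # remember the best so far
    return best

# Toy fitness (hypothetical): reward three "informative" features,
# penalise every extra feature that is kept.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras

best_mask = run_ga(n_features=8, fitness=toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
```

Keeping track of the best chromosome across generations means the returned subset is never worse than the best member of the initial population, even if later generations drift.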

Feature extraction techniques

Feature extraction plays a critical role in skin cancer image analysis by transforming raw image data into meaningful representations that can be used by machine learning models. Two primary categories of features are commonly extracted from dermoscopic images: color features and texture features³².

Color feature extraction

Color features are essential for distinguishing between different types of skin lesions, as malignant and benign lesions often exhibit varying color patterns. Color histograms and color moments are frequently employed to capture the distribution and intensity of colors within an image³³. Color histograms measure the proportion of pixels within predefined color bins, allowing for the identification of dominant colors in a lesion. Color moments, such as the mean, standard deviation, and skewness, provide a summary of the lesion’s color distribution. These color features are typically extracted from different color spaces, such as RGB, HSV, and LAB, to improve the robustness of the analysis³⁴.
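As a rough illustration of these two descriptors, the sketch below computes a normalized histogram and the first three color moments for a single channel. The bin count, the toy pixel values, and the use of standardized skewness are assumptions; some color-moment formulations use the cube root of the third central moment instead.

```python
def channel_histogram(values, bins=4, max_value=256):
    """Proportion of pixel values falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    width = max_value / bins
    for v in values:
        counts[min(int(v // width), bins - 1)] += 1
    return [c / len(values) for c in counts]

def color_moments(values):
    """Mean, standard deviation, and skewness of one colour channel."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    # Standardised third central moment; defined as 0 for a flat channel.
    skew = 0.0 if std == 0 else sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    return mean, std, skew

# Toy red-channel intensities (0-255) for eight lesion pixels.
red_channel = [10, 20, 200, 220, 130, 90, 60, 250]
hist = channel_histogram(red_channel)       # [0.375, 0.125, 0.125, 0.375]
mean, std, skew = color_moments(red_channel)
```

Repeating this per channel and per color space (for example RGB, HSV, and LAB) and concatenating the results gives a fixed-length color feature vector for each image.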

Texture feature extraction

Texture features offer additional diagnostic insights by analyzing the surface patterns and irregularities of skin lesions. One of the most common texture analysis techniques is the Gray-Level Co-occurrence Matrix (GLCM), which quantifies the spatial relationships between pixel intensities³⁵. GLCM-based features such as contrast, energy, homogeneity, and entropy are widely used to capture the texture of skin lesions. Texture-based features are critical for distinguishing between benign lesions with smooth surfaces and malignant lesions that tend to exhibit rough, irregular textures³⁶.
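The GLCM and the four descriptors named above can be sketched in plain Python. The offset (one pixel to the right), the symmetric counting, and the 4-level toy patch are assumptions made for illustration. "Energy" is computed here as the sum of squared matrix entries (the angular second moment); some libraries report its square root instead.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy, homogeneity, and entropy of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }

# Toy grayscale patch quantised to 4 gray levels (0-3).
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = glcm_features(glcm(patch, levels=4))
```

Smooth regions concentrate probability mass near the matrix diagonal, which raises homogeneity and lowers contrast; rough, irregular textures do the opposite.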

While existing ML techniques such as CNNs³⁷, SVMs, and ensemble learning have shown promise in skin cancer diagnosis, they come with several limitations. CNNs, for instance, are highly data-intensive and require large, well-labeled datasets to achieve optimal performance, making them less practical in scenarios where data is scarce or imbalanced. SVMs, though effective in binary classification tasks, struggle with multi-class classification and often require significant tuning of hyperparameters, which can be computationally expensive. Ensemble methods, while improving predictive accuracy through model aggregation, are prone to overfitting, particularly when trained on limited or biased datasets. Additionally, many models do not generalize well across diverse datasets, limiting their applicability in real-world clinical settings. These challenges highlight the need for an approach that not only combines the strengths of multiple models but also optimizes feature selection to enhance generalizability, reduce overfitting, and improve the classification. Our proposed Max Voting ensemble method, paired with a Genetic Algorithm for feature optimization, seeks to address these limitations.

Table 1 Summary of key related works.
