Abstract
Cancer is among the most dangerous diseases contributing to rising global mortality rates. Lung cancer, particularly adenocarcinoma, is one of the deadliest forms and severely impacts human life. Early diagnosis and appropriate treatment significantly increase patient survival rates. Computed Tomography (CT) is a preferred imaging modality for detecting lung cancer, as it offers detailed visualization of tumor structure and growth. With the advancement of deep learning, the automated identification of lung cancer from CT images has become increasingly effective. This study proposes a novel lung cancer detection framework using a Flower Pollination Algorithm (FPA)-based weighted ensemble of three high-performing pretrained Convolutional Neural Networks (CNNs): VGG16, ResNet101V2, and InceptionV3. Unlike traditional ensemble approaches that assign static or equal weights, the FPA adaptively optimizes the contribution of each CNN based on validation performance. This dynamic weighting significantly enhances diagnostic accuracy. The proposed FPA-based ensemble achieved an impressive accuracy of 98.2%, precision of 98.4%, recall of 98.6%, and an F1 score of 0.985 on the test dataset. In comparison, the best individual CNN (VGG16) achieved 94.6% accuracy, highlighting the superiority of the ensemble approach. These results confirm the model’s effectiveness in accurate and reliable cancer diagnosis. The proposed study demonstrates the potential of deep learning and neural networks to transform cancer diagnosis, helping early detection and improving treatment outcomes.
Introduction
Lung cancer, particularly adenocarcinoma, is one of the most lethal forms of cancer, and its mortality rate is very high. Only about 45 in 100 people (45 percent) survive lung cancer for one year or more1,2,3, only about 20 in 100 people (20 percent) survive for five years or more, and long-term survival falls to a mere 10 in 100 people (10 percent) affected by lung cancer2,3. Scientists across the globe have worked tirelessly for decades to find a cure and a vaccine for this disease, but progress has been slow; the most effective way to manage lung cancer is to catch it at a nascent stage, because if it is detected late it is difficult to cure and the survival rate is quite low4,5. This study aims to identify an effective method for detecting lung cancer in CT scan images using convolutional neural networks and other deep learning models6,7,8. We do this with the help of different architectures, namely CNN, VGG16, InceptionV3, ResNet101V2, MobileNet, Xception, and DenseNet model frameworks, and we aim to detect fatal lung nodules as early as possible by estimating their probability9,10,11. Medical practitioners perform various diagnostic procedures, including clinical assessments, CT scan analysis, positron emission tomography (PET), and needle biopsy analysis. However, such invasive methods involve high risks and cause anxiety. CT imaging is regarded as the most suitable technique for detecting lung cancer; low-dose CT is preferred because it uses lower-dose radiation than standard-dose CT12. Results also show that cancer-related deaths were lower in persons screened with low-dose CT than in those screened with chest radiographs. In this approach, images are divided into thinner slices, enabling better detail; images with a slice thickness larger than 2.5 mm were discarded. This led to a total of 888 CT scans, with a total of 36,378 annotations by radiologists13. Only annotations of nodules greater than or equal to 3 mm were considered relevant, and nodules observed by different readers that were closer than the sum of their radii were merged, with the merged annotations averaged. Lung cancer is one of the most difficult challenges in the field of oncology and a significant contributor to global deaths. Early and precise recognition of lung cancer is essential for effective treatment and better outcomes. Among the various diagnostic tools available, computed tomography (CT) has become a cornerstone in the detection and treatment of lung cancer. Nonetheless, manually interpreting CT images remains challenging and susceptible to variability, highlighting the necessity for a reliable, automated, and more efficient approach to detect lung cancer at an early stage.
In the past few years, advancements in deep learning have fundamentally transformed medical image analysis, yielding promising outcomes for enhanced diagnostic precision. This study aims to combine the capabilities of deep learning with optimization and fuzzy image enhancement strategies to create an efficient method for early detection from CT images9. Lung cancer detection comes with challenges, including the subtle and heterogeneous nature of lesions, anatomical variation between patients, and the presence of noise and artifacts in CT scans. Traditional techniques often rely on manual interpretation by radiologists, which is time-consuming and prone to error. In addition, the vast volume of imaging data generated in clinical settings requires efficient and scalable solutions that meet diagnostic requirements4,13. Convolutional Neural Networks (CNNs) are adept at extracting layered features from unprocessed pixel data, rendering them highly effective for image analysis applications. By training CNNs on annotated CT scan images, we can take advantage of their ability to automatically extract discriminative features that are indicative of lung cancer. However, indiscriminate application of CNNs to entire CT volumes can lead to suboptimal results due to the presence of non-cancerous structures. To mitigate this problem, image segmentation techniques can be used to delineate regions of interest in the lung, effectively guiding the CNN’s focus toward suspicious lesions1,10. The segmentation process involves semantically dividing the CT images into significant regions such as the lung parenchyma, blood vessels, and lesions. Depending on the complexity of this task, different segmentation algorithms can be used, including thresholding, region growing, and deep learning-based methods. Once regions of interest are identified, CNNs are applied for feature extraction, and classification of potentially cancerous lesions is performed. Recent advancements in transformer-based models have significantly improved medical image segmentation. For instance, the method in14 employs a mixed transformer with semantic segmentation and triplet pre-processing to fuse MRI and PET data for early multi-class Alzheimer’s diagnosis. Similarly, XAI-RACapsNet15 integrates an explainable capsule network with O-net ROI segmentation to enhance breast cancer detection in mammography images. Additionally, the model in16 leverages a deep dual patch attention mechanism and adversarial learning for accurate epileptic seizure prediction. These studies highlight the growing impact of transformer-based and attention-driven models in achieving more accurate, interpretable, and disease-specific segmentation results. This synergistic approach not only increases the sensitivity and specificity of lung cancer detection, but also reduces false positives and improves computational efficiency3,17. In this paper, we present a comprehensive framework for deep learning-based cancer detection that includes both theoretical foundations and practical implementation. We begin by reviewing existing methodologies and challenges in lung cancer detection, thereby laying the groundwork for our proposed approach. Subsequently, we delve into the theoretical foundations of deep learning and clarify the principles of CNNs and their application in medical image analysis2,18.
We then describe the methodology used to integrate segmentation and CNNs for lung cancer detection, considering technical nuances and design considerations. In addition, we present results demonstrating the efficiency and robustness of our approach, validated on a diverse CT image dataset. Through this research, we aim to advance the state-of-the-art in lung cancer detection and to offer a scalable and reliable solution for clinical practice. By leveraging the synergy between deep learning and image segmentation6, our objective is to provide healthcare professionals with tools that facilitate early diagnosis and personalized treatment for each patient, ultimately improving patient health.
In this study, we propose an innovative deep learning framework. Our ensemble learning approach involves evaluating different pre-trained CNN models for their feature extraction capabilities, along with employing optimization techniques to calculate optimal weights for classifying lung cancer. The main contributions include the following.
- Six pre-trained convolutional neural network (CNN) architectures are modified and fine-tuned through transfer learning techniques, utilizing a publicly available lung cancer dataset. These CNN variants, which feature various architectural components, have been evaluated as base classifiers.
- The top three CNN models in terms of performance are selected to develop an ensemble model using a weighted averaging approach, ensuring robustness, stability, and improved classification performance. The combination of multiple CNN models helps reduce individual model errors.
- A weighted ensemble learning approach is introduced to enhance the performance of the selected CNN models.
- Various optimization algorithms, including the Flower Pollination Algorithm (FPA), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO), Bayesian Optimization (BO), and Ant Colony Optimization (ACO), are evaluated for optimizing ensemble weights. Based on the performance results, FPA is selected for weight optimization, leading to the development of the proposed FPA-based weighted ensemble model.
- An experimental comparison of various CNN models with the ensemble model has been conducted to assess the effectiveness of the proposed methodology. The proposed model exhibited a notable enhancement over the top-performing individual CNN model, affirming its efficacy.
- The proposed model has demonstrated improved overall performance, an optimal balance between precision and recall, and enhanced generalization capability for classifying lung cancer.
The rest of the paper is organized as follows. The ’Related Work’ section reviews the literature. ’Materials and Methods’ covers the methodology, including the dataset, the CNN models used, the ensemble approach, and the optimization technique (FPA) used to find the optimal weights for the ensemble. ’Proposed System Design’ discusses the proposed methodology. The ’Experimental Results and Discussion’ section describes the platform and evaluation metrics used to conduct and evaluate the experiments, along with the results and discussion. The final section is the Conclusion.
Related work
Lung cancer, or adenocarcinoma, was first properly described by physicians in the mid-19th century, yet in the early half of the twentieth century it was still quite rare. As industrialization increased air pollution, the number of lung cancer cases grew, and extensive smoking further raised the risk of lung cancer19,20. The first computer-aided detection (CAD) systems for lung nodules were developed in the late 1980s, but they were not very effective in detecting lung cancer. Later, new technologies such as graphics processing units and convolutional neural networks were combined for the detection of lung cancer21. Tan et al.22 present a review of techniques for semantic medical image segmentation. Many prominent researchers have worked, and are still working, to find a cure and vaccine for lung cancer and to detect it at a nascent stage, so that it can be caught early, prevented from spreading further, and the patient’s life saved. Setio et al. proposed a fully 3D convolutional neural network classifier to reduce false positives in lung nodule classification; 3D classification exploits the full CT volume and reduces the probability of a wrong decision23. Xu et al.24 proposed an image fusion technique using a dual-gain video stream. Ding and Liao et al. used a 3D Faster R-CNN to decrease false positives in nodule detection, speeding up nodule classification and employing a dual-path network to prevent misinterpretation of lung nodules. Hammad et al.25 proposed a myocardial infarction detection model. Jiang Hongyang is also one of the prominent researchers who have worked on early lung cancer detection26; he developed a group-based pulmonary nodule detection method that combines multiple techniques with the Frangi filter to improve performance19, merging two sets of images and using a four-channel 3D CNN to learn the features annotated by radiologists. These are just a few examples; many more scientists, radiologists, and physicians have worked tirelessly to find suitable and effective methods for the proper detection of lung cancer, so that the thousands of lives lost to the disease can be saved. Sedik et al.27 proposed a method for the detection of coronavirus. Chen et al.28 proposed a model for emotion detection. In recent years, research at the intersection of deep learning and medical imaging has increased, seeking efficient ways to detect adenocarcinoma early and to diagnose and treat lung cancer more efficiently and cost-effectively. To run multiple models in parallel, shell scripting29 played a major role in our work. Gao et al.30 studied the effect of threat detection on image segmentation. Hammad et al.31 proposed a model for arrhythmia detection using deep learning.
Several studies have explored deep learning approaches for lung cancer classification using CT scan images. Venkatesh et al.32 used a combination of K-Means clustering and CNN on public and private datasets, achieving an impressive accuracy of 99.967% with an MSE33 of 0.031. Raza et al.34 utilized EfficientNetB1-B4 with transfer learning on the IQ-OTH/NCCD dataset, reporting an accuracy of 99.01% and an AUC of 0.99. Similarly, Dash et al.35 applied EfficientNet with an autoencoder on the same dataset, achieving an accuracy of 98.98% and an AUC of 0.9872. Murthy et al.36 introduced a fuzzy-based Efficient Residual Network for classification using the LIDC-IDRI dataset, achieving 93.2% accuracy, 94.8% true positive rate, and 92.6% precision. Sultana et al.37 experimented with MobileNetV2, VGG19, and ResNet50 on chest CT and PET-CT images, reaching an accuracy of 98.67%. Tawfik et al.38 combined the CLAHE algorithm with Xception and EfficientNet on a public CT dataset, achieving a specificity of 99.68% and an accuracy of 99.03%. Shatnawi et al.39 tested ConvNeXt, VGG16, and ResNet50 on a public CT dataset, obtaining 99% accuracy and 99.2% precision. Gulsoy et al.40 proposed FocalNeXt on the IQ-OTH/NCCD dataset, achieving high sensitivity (99.78%), recall (99.36%), F1 score (99.56%), and an overall accuracy of 99.81%. Flyckt et al.41 used Dynamic Ensemble Selection to achieve an AUC of \(0.77\pm 0.01\) from standard blood tests and patient history data, incorporating multimodal analysis for improved prediction accuracy. Zia et al.42 proposed a Dual Attention CNN that showed improved performance through channel and spatial attention mechanisms, particularly excelling at identifying small nodules with a 92% detection rate for early-stage cancers. Shah et al.43 introduced a Deep Ensemble 2D CNN approach for lung nodule detection, utilizing multiple CNNs and the LUNA16 dataset to achieve enhanced accuracy in cancer screening. Their architecture combines three distinct CNNs with varying configurations, demonstrating 95% accuracy and outperforming the baseline methods.
Although these studies demonstrate the effectiveness of various deep learning models, there is a need to improve classification performance by utilizing ensemble techniques. The proposed research aims to introduce a weighted average ensemble44 based on the flower pollination algorithm (FPA) to improve classification accuracy. By optimally integrating multiple models, the ensemble approach seeks to achieve better generalization and robustness in the classification of lung cancer. Table 1 summarizes recent work on the detection of lung cancer.
Materials and methods
Description of dataset
We use the chest CT-scan image dataset45, which is publicly available for research. The dataset consists of CT scan images categorized according to the presence and type of cancer: normal lungs without cancer, adenocarcinoma, large cell carcinoma, and squamous cell carcinoma. Adenocarcinoma is the most common form of lung cancer, accounting for about 30% of all cases. Large cell carcinoma represents approximately 10–15% of non-small cell lung cancers. Squamous cell carcinoma, closely linked to smoking, accounts for approximately 30% of non-small cell lung cancers. In contrast, normal CT scans serve as a baseline for healthy lung tissue, helping to distinguish between malignant and non-malignant cases in diagnostic imaging. The dataset includes 1000 CT scans in .jpg and .png formats, covering three types of lung cancer: adenocarcinoma (338 scans), large cell carcinoma (187 scans), and squamous cell carcinoma (260 scans), along with 215 scans of normal lungs. Each input image is resized to \(224\times 224\) pixels, and its pixel intensities are normalized to the range 0 to 1 by dividing each pixel value by 255.0, reducing computational complexity and ensuring consistency in data processing. As part of data pre-processing, the training images undergo augmentation using the ImageDataGenerator to enhance model generalization. The augmentation includes rotations of up to 10 degrees, width and height shifts of up to 20%, shearing, zooming, and horizontal flipping, ensuring variability in the dataset. These transformations help prevent overfitting by exposing the model to diverse variations in the input data. However, no augmentation is applied to the validation and test images; they are only rescaled to maintain consistency during evaluation. This approach ensures that the model learns robust features while being fairly evaluated on unseen data. Figure 1 shows the distribution of data for training, validation, and testing, and Fig. 2 shows sample images of the four categories.
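A minimal sketch of this pre-processing and augmentation pipeline using the Keras `ImageDataGenerator` is given below; the directory layout, batch size, and the exact shear and zoom factors are illustrative assumptions rather than the exact settings used in the study.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (224, 224)      # images resized to 224x224 as described above
BATCH_SIZE = 64

# Training generator: rescale to [0, 1] plus the augmentations listed in the text.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255.0,       # divide each pixel value by 255.0
    rotation_range=10,         # rotations of up to 10 degrees
    width_shift_range=0.2,     # horizontal shifts of up to 20%
    height_shift_range=0.2,    # vertical shifts of up to 20%
    shear_range=0.2,           # shearing (factor assumed)
    zoom_range=0.2,            # zooming (factor assumed)
    horizontal_flip=True,      # horizontal flipping
)

# Validation and test generators: rescaling only, no augmentation.
eval_datagen = ImageDataGenerator(rescale=1.0 / 255.0)

train_gen = train_datagen.flow_from_directory(
    "data/train", target_size=IMG_SIZE, batch_size=BATCH_SIZE, class_mode="categorical")
val_gen = eval_datagen.flow_from_directory(
    "data/valid", target_size=IMG_SIZE, batch_size=BATCH_SIZE, class_mode="categorical")
test_gen = eval_datagen.flow_from_directory(
    "data/test", target_size=IMG_SIZE, batch_size=BATCH_SIZE,
    class_mode="categorical", shuffle=False)
```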
CNNs
Convolutional Neural Networks (CNNs) are widely used in image-related applications46. These networks comprise multiple distinct layers: input, convolution, activation, pooling, and fully connected layers. The convolution layers extract features from the input data, and activation functions such as ReLU introduce nonlinearity into the network. Pooling layers reduce the number of parameters, thereby lessening the likelihood of overfitting. Finally, fully connected layers are responsible for making predictions, which are produced using a softmax classifier.
In experimental studies, pre-trained CNN models47 based on the architecture of ResNet, MobileNet, DenseNet, VGG, and Inception have been evaluated. These models leverage unique building blocks such as residual connections, dense blocks, inception modules, and separable convolutions, enabling them to learn complex features effectively. The distinct design of each architecture helps to optimize performance for various image processing tasks.
Flower pollination algorithm (FPA) for optimizing ensemble weights
The flower pollination algorithm (FPA) is a nature-inspired optimization method that emulates the flower pollination process48. This study uses FPA to identify the optimal weights for a weighted ensemble made up of three main models, \(m_1, m_2, m_3\). The aim is to discover the weight combination that maximizes classification accuracy. The optimization involves the following phases, and the complete process is shown in Algorithm 1.
Phase 1: Initialization A population of candidate weight vectors \(W_i\) is initialized, where \(i = 1, 2, \dots , n\), and each weight vector is represented as:
\[ W_i = [w_1, w_2, w_3], \qquad w_1 + w_2 + w_3 = 1, \quad w_j \ge 0. \]
Here, \(w_1, w_2, w_3\) are the weights assigned to the models \(m_1, m_2,\) and \(m_3\), respectively. The sum constraint ensures a valid probability distribution.
Phase 2: Global pollination Global pollination enables exploration using Lévy flights:
\[ W_i^{t+1} = W_i^{t} + \lambda \, L(\beta) \left( W^{*} - W_i^{t} \right), \]
where \(W^*\) is the best weight vector found so far, \(L(\beta )\) represents the Lévy flight step size, and \(\lambda\) is the scaling factor. This phase allows significant exploration in the search space to avoid local optima.
Phase 3: Local pollination Local pollination mimics self-pollination within similar solutions:
\[ W_i^{t+1} = W_i^{t} + \epsilon \left( W_j^{t} - W_k^{t} \right), \]
where \(W_j^t\) and \(W_k^t\) are two randomly selected weight vectors, and \(\epsilon\) is a random value from \(U(0,1)\). This mechanism enables fine-tuning of weight adjustments.
Phase 4: Switching probability A probability \(p \in [0,1]\) determines whether global or local pollination is performed: a uniform random number is drawn for each candidate, and global pollination is applied when it falls below \(p\); otherwise, local pollination is used.
This balance between exploration and exploitation ensures efficient optimization.
Phase 5: Fitness evaluation The fitness function \(f(W)\) evaluates the classification performance of the weighted ensemble using:
\[ f(W) = \mathrm{Accuracy}\!\left( \sum_{i=1}^{3} w_i \, \hat{y}_i \right), \]
where \(\hat{y}_i\) denotes the softmax output of model \(m_i\) on the validation set.
The weights are optimized to maximize accuracy or other performance metrics. The best weight vector \(W^*\) is updated iteratively as:
\[ W^{*} = \arg\max_{1 \le i \le n} f(W_i). \]
Phase 6: Stopping criterion The optimization stops when a predefined maximum number of iterations \(T\) is reached or when the change in the fitness function falls below a tolerance threshold.
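To make the phases above concrete, the following is a minimal Python sketch of an FPA weight search under the stated constraints (sum-to-one weights, Lévy-flight global moves, local pollination, switch probability \(p\)). All hyperparameter defaults, the `fitness` callback, and the Mantegna construction of the Lévy step are illustrative assumptions, not the authors' exact implementation.

```python
import math
import numpy as np

def fpa_optimize(fitness, dim=3, n_pop=20, p=0.8, n_iter=100, beta=1.5, lam=0.01, seed=0):
    """Search for ensemble weights maximizing `fitness` (e.g. validation accuracy)."""
    rng = np.random.default_rng(seed)

    def normalize(w):
        w = np.clip(w, 1e-6, None)
        return w / w.sum()                       # keep weights positive and summing to 1

    def levy_step(size):
        # Mantegna's algorithm for Levy-distributed step lengths with exponent beta
        sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
                 (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        u = rng.normal(0.0, sigma, size)
        v = rng.normal(0.0, 1.0, size)
        return u / np.abs(v) ** (1 / beta)

    pop = np.array([normalize(rng.random(dim)) for _ in range(n_pop)])   # Phase 1
    fit = np.array([fitness(w) for w in pop])
    best = pop[fit.argmax()].copy()

    for _ in range(n_iter):                                              # Phase 6: iteration budget
        for i in range(n_pop):
            if rng.random() < p:                                         # Phase 4: switch probability
                cand = pop[i] + lam * levy_step(dim) * (best - pop[i])   # Phase 2: global pollination
            else:
                j, k = rng.choice(n_pop, size=2, replace=False)
                cand = pop[i] + rng.random() * (pop[j] - pop[k])         # Phase 3: local pollination
            cand = normalize(cand)
            cand_fit = fitness(cand)                                     # Phase 5: fitness evaluation
            if cand_fit > fit[i]:                                        # greedy replacement
                pop[i], fit[i] = cand, cand_fit
        best = pop[fit.argmax()].copy()

    return best, float(fit.max())
```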
Transfer learning
Transfer learning involves employing a model already trained on one task to address another related challenge. This method capitalizes on the insights gained from the initial task to enhance learning efficiency in the new scenario. Through transfer learning, the proposed framework attains superior generalization and robustness49. The pretrained base classifiers, originally trained on the ImageNet dataset50, underwent fine-tuning, with their classification layers adjusted to fit the particular class structure.
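As an illustration of this fine-tuning setup, the sketch below loads one of the base classifiers (VGG16) with ImageNet weights and replaces its classification head with a four-class softmax layer. The pooling/dense/dropout head shown here is an assumption rather than the authors' exact configuration, and the other five architectures are handled analogously via their respective `tensorflow.keras.applications` constructors.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Base network pretrained on ImageNet, without its original 1000-class top.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False            # freeze ImageNet features during initial fine-tuning

# New classification head adjusted to the dataset's class structure
# (adenocarcinoma, large cell carcinoma, squamous cell carcinoma, normal).
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4, activation="softmax"),
])
```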
Ensemble of CNN models
Ensemble learning involves the aggregation of predictions from multiple models trained on the same data set, leading to improved predictive accuracy51. The fundamental concept is to strategically integrate base models to form a more robust final model. Using ensemble methods aids in reducing model variance and errors and often results in superior performance compared to individual models alone. When applied to deep CNN architectures, ensemble methods take advantage of the feature extraction capabilities of each model, thus improving generalization performance. Standard ensemble techniques include bagging, stacking, voting, and prediction averaging. In particular, the average ensemble is widely utilized for classification tasks. Our methodology improves the significance of the model employing a weighted ensemble approach instead of the conventional average ensemble52. Figure 3 illustrates the weighted average ensemble method, which combines predictions from multiple models by assigning varying weights according to their performance. The process includes the following steps:
1. Train multiple individual CNN models on the lung cancer dataset.
2. Each model generates its prediction for the test data.
3. Assign weights to the models based on predefined criteria, such as accuracy or confidence.
4. Compute the final prediction as a weighted sum of the individual model predictions.
Mathematically, the final prediction of the ensemble \({\hat{y}}\) is given by:
\[ {\hat{y}} = \sum_{i=1}^{N} w_i \, {\hat{y}}_i, \]
where:
- \(N\) is the number of models,
- \(w_i\) is the weight assigned to the \(i\)-th model, ensuring \(\sum _{i=1}^{N} w_i = 1\),
- \({\hat{y}}_i\) is the prediction of the \(i\)-th model.
This approach ensures that models with higher reliability contribute more to the final decision, improving overall accuracy.
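A minimal sketch of this weighted soft-voting step is shown below; the probability arrays in the usage comment are placeholders for each model's softmax outputs, and the weight values are purely illustrative.

```python
import numpy as np

def weighted_ensemble_predict(probs_list, weights):
    """Combine per-model softmax outputs, each of shape (n_samples, n_classes),
    into a weighted prediction; the weights are expected to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    combined = sum(w * p for w, p in zip(weights, probs_list))
    return combined.argmax(axis=1)      # final class = argmax of the weighted sum

# Usage (placeholders): p_vgg16, p_inceptionv3, p_resnet101v2 are model.predict() outputs.
# y_pred = weighted_ensemble_predict([p_vgg16, p_inceptionv3, p_resnet101v2],
#                                    [0.49, 0.33, 0.18])
```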
Performance measurement metrics
Various performance metrics are used to evaluate deep learning models. The confusion matrix is used to calculate accuracy, precision, recall, and F1 score53. The mathematical formulations for these performance metrics are given by Eqs. (4)–(7):
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{5} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{6} \]
\[ F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7} \]
where \(TP\): true positive; \(TN\): true negative; \(FP\): false positive; \(FN\): false negative54.
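For reference, these metrics can be computed from the predicted and true class labels as in the short sketch below; the macro averaging over the four classes is an assumption about how the multi-class scores are aggregated.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

def evaluate(y_true, y_pred):
    """Compute the metrics of Eqs. (4)-(7) from integer class labels."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall": recall_score(y_true, y_pred, average="macro"),
        "f1": f1_score(y_true, y_pred, average="macro"),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```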
Proposed system design
Pre-trained CNN models
In CNN models, convolutional filters facilitate the identification and learning of image features without the need for preliminary feature extraction. This research has analyzed the performance of six pre-trained CNN architectures to evaluate the efficacy of CNN model ensembles. The examined architectures are DenseNet121, MobileNet, Xception, VGG16, InceptionV3, and ResNet101V2. These models are evaluated to showcase their capabilities in various classification tasks.
FPA algorithm-based model weighting
Flower Pollination Algorithm (FPA) is an optimization method derived from the natural flower pollination process. This technique utilizes the concepts of both local and global pollen transfer. Local pollination involves self-pollination or interactions between nearby flowers, while global pollination occurs via distant pollen transfer facilitated by long-range pollinators. The aim of the FPA is to thoroughly investigate the search space and determine the optimal solutions by achieving a balance between exploration and exploitation. In the proposed model, the initial set of solutions is represented as a population of flowers, each characterized by a set of weights. These weights undergo iterative adjustment according to the FPA principles. Model performance is assessed through fitness values based on accuracy measures. The algorithm generates new candidate solutions using combined local and global pollination strategies, choosing the top solutions based on fitness scores. Through continuous iterations, the most effective solutions are preserved, while ineffective ones are eliminated. After multiple generations, the final optimal solution is identified as the best result discovered. The FPA-refined weights are then integrated into the ensemble classifier, enhancing its overall performance.
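As a usage illustration, the weight search can be wired to the ensemble's validation accuracy as its fitness function. The arrays below are hypothetical stand-ins for the per-model softmax outputs and integer validation labels, and `fpa_optimize` refers to the sketch given earlier.

```python
import numpy as np

# Dummy validation outputs: per-model softmax matrices of shape (n_val, 4) and
# integer labels. Replace these with the real model.predict() outputs and labels.
rng = np.random.default_rng(1)
n_val = 72
y_val = rng.integers(0, 4, n_val)
val_probs_resnet, val_probs_inception, val_probs_vgg16 = (
    rng.dirichlet(np.ones(4), n_val) for _ in range(3))

def ensemble_val_accuracy(weights):
    """Fitness: accuracy of the weighted soft vote on the validation split."""
    combined = (weights[0] * val_probs_resnet +
                weights[1] * val_probs_inception +
                weights[2] * val_probs_vgg16)
    return float((combined.argmax(axis=1) == y_val).mean())

best_weights, best_acc = fpa_optimize(ensemble_val_accuracy, dim=3)
print("FPA weights:", best_weights, "validation accuracy:", best_acc)
```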
Proposed FPA based weighted ensemble classifier
The study presented involves the development of a lung cancer detection strategy utilizing deep-ensemble learning. This approach integrates several individual classifiers to formulate a prediction model that is both more precise and reliable. The typical procedure for creating an ensemble classifier consists of selecting base classifiers, training these classifiers, constructing the ensemble, and assessing its performance, as detailed below.
1. Best-3 classifier selection: In this study, six pre-trained models were applied to the lung cancer dataset and evaluated using 5-fold cross-validation to ensure robustness and reduce overfitting. Each model was trained and validated across five different splits of the dataset, and the performance metrics were averaged over all folds. These models encompass diverse architectures to effectively capture different data attributes. Based on the mean performance across the folds, the three leading models were selected for ensemble development, with the flower pollination algorithm (FPA) used to determine their optimal weights (a sketch of this selection step is given after the list).
2. Ensemble construction: The ensemble prediction is achieved by integrating the outputs of the top three CNN models using a weighted average approach, where the weights are determined by FPA optimization.
3. Ensemble evaluation: The ensemble’s performance has been assessed using testing data and relevant metrics.
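The sketch below illustrates the selection step from item 1; `build_fns` maps architecture names to functions returning freshly compiled Keras models, and `X`, `y` (one-hot labels) stand in for the training split. All of these names, plus the shortened epoch budget, are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def rank_base_models(build_fns, X, y, k=5, epochs=10):
    """Average validation accuracy of each candidate CNN over k stratified folds
    and return the three best-performing architectures."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=42)
    mean_acc = {}
    for name, build_fn in build_fns.items():
        fold_acc = []
        for tr, va in skf.split(X, y.argmax(axis=1)):   # stratify on the class index
            model = build_fn()                          # fresh model compiled with an accuracy metric
            model.fit(X[tr], y[tr], epochs=epochs, verbose=0)
            _, acc = model.evaluate(X[va], y[va], verbose=0)
            fold_acc.append(acc)
        mean_acc[name] = float(np.mean(fold_acc))
    top3 = sorted(mean_acc, key=mean_acc.get, reverse=True)[:3]
    return mean_acc, top3
```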
Figure 4 presents a simulation of the weighted ensemble model based on the Flower Pollination Algorithm (FPA) for lung cancer detection using CT images, with further details provided in Algorithm 2.
Experimental results and discussion
The research conducted included a series of experimental setups. Initially, the top three convolutional neural network (CNN) architectures were selected for their high classification efficacy. To build an ensemble model, a weighted averaging strategy was implemented, and various optimization techniques55 were used to determine the optimal ensemble weights. Analysis through confusion matrices and performance graphs indicated that this framework yields superior outcomes. The subsequent section evaluates the effectiveness of the proposed methodology compared to existing studies on lung cancer detection using pre-trained CNN models. In the experiments, an initial learning rate of 0.01 and a mini-batch size of 64 were employed, with Stochastic Gradient Descent serving as the optimization method. Training was completed after 100 epochs, with no indication of network overfitting. For the classification layer within the deep CNN models, categorical cross-entropy loss was used to train the weighted ensemble model and the comparison models. All experiments were conducted on the Kaggle GPU platform.
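The training configuration described above can be expressed as the following sketch; it assumes the `model` and data generators from the earlier snippets and is illustrative rather than the authors' exact training script.

```python
from tensorflow.keras.optimizers import SGD

model.compile(optimizer=SGD(learning_rate=0.01),   # initial learning rate 0.01
              loss="categorical_crossentropy",      # loss for the 4-class problem
              metrics=["accuracy"])

# The mini-batch size of 64 is set on the data generators; training runs for 100 epochs.
history = model.fit(train_gen, validation_data=val_gen, epochs=100)
```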
Results
Results of pretrained CNN models
Table 2 compares the performance of six CNN models for the lung cancer detection task based on accuracy, precision, recall, and F1 score metrics. Among the models, VGG16 and InceptionV3 delivered the highest performance, with accuracies of 94.6% and 94.0%, respectively, along with excellent precision, recall, and F1 score values, indicating a robust ability to classify images accurately. MobileNet also demonstrated high performance, with an accuracy of 91.7% and consistent precision and recall. However, DenseNet121 achieved moderate results with an accuracy of 86.7%, while Xception showed the lowest overall performance, with an accuracy of 80.6%. ResNet101V2 offered a balanced trade-off with a strong accuracy of 92.7%, suggesting that it is reliable for high-performance classification tasks. For a visual representation of the classification results, the confusion matrices of all CNN models at the end of the testing process are presented in Fig. 6.
Based on the classification results of the pre-trained CNN models shown in Table 2 and illustrated in Fig. 5, VGG16, InceptionV3, and ResNet101V2 emerge as the leading models, achieving accuracy levels of 94.6%, 94.0% and 92.7%, respectively. From comparison of execution time and model size, it is clear that MobileNet has the smallest model size (19.1 MB) and the fastest inference time (6.72 s), and ResNet101V2 and Xception have the highest training times (1326.3 s and 1314.86 s, respectively), while VGG16 is the largest model at 192 MB, requiring more storage and computational power.
Performance comparison with other optimization algorithms
In ensemble learning, numerous methodologies are available to determine the optimal weights for classifiers, typically using optimization strategies to improve the prediction accuracy of the ensemble model. To determine the most effective ensemble weights, a variety of optimization techniques was evaluated, including the Flower Pollination Algorithm (FPA), Particle Swarm Optimization (PSO), Bayesian Optimization (BO), Artificial Bee Colony (ABC) and Ant Colony Optimization (ACO). Table 3 displays the performance metrics of these optimization methods when applied to the classification of lung cancer, with the associated confusion matrices depicted in Fig. 7. The ensemble’s weights were fine-tuned to enhance the model’s effectiveness, ensuring that each classifier’s input was balanced. Among the various algorithms evaluated, FPA was identified as the most effective in redefining the weights of the ensemble, as detailed in Table 3. The ensemble model employing FPA weights achieved the highest classification accuracy for detecting Lung Cancer, with a value of 98.2%, exceeding other methods presented in Table 3.
The ensemble weights of the proposed model were derived from the optimal weights identified by the FPA. As shown in Table 4, these weights formed a combination that enhanced the ensemble model’s predictive performance to its maximum capability. The weights, determined using various optimization methods, collectively sum to 1. Specifically, the FPA algorithm assigned the following weights: 0.17642552 for ResNet101V2, 0.3318414 for InceptionV3, and 0.49173307 for VGG16. Notably, the VGG16 model received the highest weight because its prediction results, derived from softmax layer outputs, typically surpass those of the other models.
Proposed FPA ensemble model
Among the various CNN architectures, VGG16, InceptionV3, and ResNet101V2 stand out, with classification accuracies of 94.6%, 94.0%, and 92.7%, respectively. However, the FPA-based ensemble method surpasses these models, achieving an accuracy of 98.2% on the test dataset. As detailed in Table 3, FPA fine-tunes the ensemble weights to achieve maximum accuracy. The optimal weight combination utilizing FPA, shown in Table 4, yields better classification results. Figs. 8 and 9 represent the accuracy and loss metrics per epoch for the three top CNN models throughout training. There is minimal variation between training and validation accuracy and loss values, demonstrating the model’s ability to avoid overfitting and its capacity to generalize effectively to new data. Figure 10 shows the AUC(ROC) and AUC(PR) comparison of the top 3 base models with the proposed FPA-weighted ensemble.
Various CNN models tend to focus on unique patterns or data features during their training processes. By combining the predictions of multiple CNN models, the ensemble approach achieves higher performance compared to each model working in isolation. This enhanced performance is primarily due to the ensemble’s ability to effectively generalize to novel, unseen instances. Figure 11 presents the test results for the three highest-performing CNN models along with the FPA-weighted ensemble model.
The following key observations can be made:
1. The FPA-based weighted ensemble model surpasses standalone CNN models by reaching an accuracy rate of 98.2%. This result underscores the enhanced predictive capability of the ensemble model overall.
2. The proposed ensemble model achieves a precision of 98.4%, which is notably superior to the top-performing individual CNN classifier, VGG16, which registers a precision of 95.0%. This increased precision of the ensemble model suggests a reduction in false-positive errors.
3. The proposed model demonstrates a recall rate of 98.6%, surpassing the performance of the standalone CNN models. This increased recall indicates that the ensemble model is more effective in identifying a greater number of positive cases relative to the individual classifiers.
4. The proposed model achieves an F1 score of 98.5%, demonstrating a more effective balance between precision and recall compared to the CNN models in isolation.
In summary, the proposed FPA weighted Ensemble Model outperforms individual CNN models in terms of accuracy, precision, recall, and F1 score. This model demonstrates superior overall performance, effectively balancing precision and recall. The enhanced performance can be attributed to the ensemble methodology, which combines various classifiers, thus increasing precision compared to the use of single classifiers.
Comparison of proposed model with existing work
This subsection evaluates the effectiveness of the proposed FPA-based ensemble technique relative to other approaches using the lung cancer dataset. As illustrated in Table 3, the FPA-based ensemble model achieves a classification accuracy of 98.2% for the detection of lung cancer, surpassing the results of other studies reported in the literature. Furthermore, Fig. 12 shows that this proposed ensemble model consistently outperforms all other comparable studies using similar datasets.
Discussion
This study introduces a novel FPA-based weighted ensemble model (Fig. 4) to detect and classify Lung Cancer. The model was evaluated using datasets that contain images of lung cancer classes (Fig. 2).
Convolutional neural networks (CNNs) are widely recognized for their efficacy in solving image-based problems, offering satisfactory performance because of their robust feature extraction capabilities. To assess the effectiveness of the proposed model, a comparative analysis of six pre-trained CNN models was performed. In particular, VGG16 achieved the highest accuracy of 94.6% (Table 2), primarily attributable to its deep stack of small convolutional filters, which improves both feature extraction and adaptability.
An FPA-based weighted ensemble model was then designed to demonstrate how combining multiple deep CNN models can leverage the individual feature extraction strengths of each model, resulting in enhanced generalization. This ensemble integrates the predictions of the top three CNN models, VGG16, InceptionV3, and ResNet101V2, which achieved classification accuracies of 94.6%, 94.0%, and 92.7%, respectively (Fig. 5).
The key advantage of the proposed FPA-based weighted ensemble approach lies in its ability to combine the strengths of multiple high-performing CNN architectures while mitigating their individual weaknesses. Traditional ensemble techniques often rely on static or equally distributed weights, which may not reflect the true contribution of each model to classification performance. By using the Flower Pollination Algorithm (FPA) for weight optimization, the proposed method adaptively assigns importance to each model based on its effectiveness, leading to improved accuracy and robustness. Furthermore, this technique improves model generalization and reduces overfitting, as the diversity among CNN architectures such as VGG16, InceptionV3, and ResNet101V2 ensures richer feature representation and better decision boundaries. This makes the model highly suitable for real-world medical imaging applications where diagnostic reliability is paramount.
Performance results validate that the proposed ensemble model not only outperforms individual CNNs but also presents a scalable and robust approach to automated lung cancer detection. These results underscore the potential of integrating metaheuristic optimization with deep learning to improve classification performance in medical diagnostics.
TOPSIS analysis
To assess the effectiveness of CNN models in lung cancer detection, we conducted a TOPSIS analysis to rank the classifiers according to all evaluation metrics. TOPSIS is a multi-criteria decision analysis method used to rank and select alternatives56. This study applies TOPSIS to compare the various convolutional neural network (CNN) models based on four performance criteria: accuracy, precision, recall, and F1 score.
Step 1: Constructing the decision matrix The decision matrix \(D\) consists of alternatives (models) and criteria:
\[ D = \left[ d_{ij} \right]_{m \times n}, \qquad i = 1, \dots, m, \; j = 1, \dots, n, \]
where \(d_{ij}\) represents the performance of the model \(i\) under criterion \(j\).
Step 2: Normalization using L2 vector normalization Each element of the matrix is normalized using:
\[ r_{ij} = \frac{d_{ij}}{\sqrt{\sum_{i=1}^{m} d_{ij}^{2}}}. \]
This ensures that all criteria are dimensionless and comparable.
Step 3: Determine the ideal best and ideal worst The ideal best values (\(A^+\)) and ideal worst values (\(A^-\)) are calculated as:
\[ A^+ = \left\{ \max_i r_{ij} \mid j \in J^+ ;\; \min_i r_{ij} \mid j \in J^- \right\}, \qquad A^- = \left\{ \min_i r_{ij} \mid j \in J^+ ;\; \max_i r_{ij} \mid j \in J^- \right\}, \]
where \(J^+\) are benefit criteria (higher is better) and \(J^-\) are cost criteria (lower is better).
Step 4: Calculate Euclidean distances The separation distances from the ideal best and the ideal worst are computed as:
\[ D_i^+ = \sqrt{\sum_{j=1}^{n} \left( r_{ij} - A_j^+ \right)^2 }, \qquad D_i^- = \sqrt{\sum_{j=1}^{n} \left( r_{ij} - A_j^- \right)^2 }, \]
where
- \(D_i^+\) is the distance from the ideal best,
- \(D_i^-\) is the distance from the ideal worst.
Step 5: Compute the TOPSIS score (relative closeness) The relative closeness is calculated as:
\[ C_i = \frac{D_i^-}{D_i^+ + D_i^-}, \]
where \(0 \le C_i \le 1\). A higher \(C_i\) indicates a better alternative.
Step 6: Rank the alternatives Models are ordered by their TOPSIS scores from highest to lowest. The model achieving the largest \(C_i\) is assigned the top rank.
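The six steps above can be condensed into the short sketch below, in which the decision-matrix rows are the models and the columns are accuracy, precision, recall, and F1 score (all treated as benefit criteria); this is an illustrative implementation, not the authors' code.

```python
import numpy as np

def topsis(decision_matrix, benefit=None):
    """Rank alternatives (rows) against criteria (columns) by relative closeness."""
    D = np.asarray(decision_matrix, dtype=float)
    if benefit is None:
        benefit = np.ones(D.shape[1], dtype=bool)      # higher is better for every metric here

    R = D / np.sqrt((D ** 2).sum(axis=0))              # Step 2: L2 vector normalization

    A_plus = np.where(benefit, R.max(axis=0), R.min(axis=0))    # Step 3: ideal best
    A_minus = np.where(benefit, R.min(axis=0), R.max(axis=0))   # Step 3: ideal worst

    d_plus = np.sqrt(((R - A_plus) ** 2).sum(axis=1))           # Step 4: distances
    d_minus = np.sqrt(((R - A_minus) ** 2).sum(axis=1))

    closeness = d_minus / (d_plus + d_minus)                    # Step 5: relative closeness
    ranking = np.argsort(-closeness)                            # Step 6: best first
    return closeness, ranking
```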
TOPSIS analysis evaluates CNN models based on their performance metrics by computing their Euclidean distances from the ideal best \(\left( A^+\right)\) and ideal worst \(\left( A^-\right)\) solutions. From Table 5 it is found that the FPA-weighted model achieves the highest relative closeness (1.000), which means that it is closest to the ideal solution. The ABC and BO weighted models follow with \(0.900540\) and \(0.817904\), respectively, making them the second and third best performers. Xception, with the highest \(D_i^+\) (\(0.106076\)) and the lowest \(D_i^-\) (\(0.000\)), ranks last. Figure 13 shows the comparison of TOPSIS analysis for all CNN models considered in this study. This analysis helps to select the most balanced model considering multiple performance criteria rather than relying solely on accuracy.
Conclusion
Lung cancer remains one of the most life-threatening diseases and contributes significantly to global mortality. Early and accurate detection using Computed Tomography (CT) imaging is critical for improving patient outcomes. This study introduces an FPA-based weighted ensemble model that integrates three robust pre-trained CNN architectures: VGG16, ResNet101V2, and InceptionV3, where the flower pollination algorithm (FPA) optimally determines the weights of the ensemble. The primary innovation lies in the synergistic integration of diverse CNN models with an evolutionary optimization technique, which not only enhances classification performance but also ensures robustness across varying data characteristics. The proposed model surpasses the performance of individual CNN classifiers, achieving a remarkable accuracy of 98.2%, precision of 98.4%, recall of 98.6%, and an F1 score of 98.5%. These results demonstrate the effectiveness of deep learning ensembles, particularly when optimized with metaheuristic techniques like FPA, in improving diagnostic reliability.
Despite the promising results, the limited dataset size (1,000 CT scans) may introduce biases and affect the generalizability of the model. Future work should focus on validating the approach using larger and more diverse datasets to ensure robustness across populations. Additionally, integrating attention mechanisms and hybrid optimization strategies could further boost accuracy and interpretability, making the framework more applicable in real-world clinical settings. The proposed model can also serve as a decision-support tool for radiologists, aiding early diagnosis, reducing errors, and supporting effective treatment planning, thus improving patient prognosis and reducing healthcare burden.
Data availability
The datasets generated and analyzed during the current study are available in the Kaggle repository, https://www.kaggle.com/datasets/mohamedhanyyy/chest-ctscan-images
References
Uthoff, J. et al. Machine learning approach for distinguishing malignant and benign lung nodules utilizing standardized perinodular parenchymal features from CT. Med. Phys. 46(7), 3207–3216 (2019).
Lung Cancer: Statistics. https://www.cancer.net/cancer-types/lung-cancer/statistics. Accessed 1 Mar 2024 (2024).
World Health Organization. The World Health Report 2001: Mental Health: New Understanding, New Hope. 2001. https://apps.who.int/iris/handle/10665/42390. Accessed 9 Dec 2021 (2021).
Zhang, Y., Oikonomou, A., Wong, A., Haider, M. A. & Khalvati, F. Radiomics-based prognosis analysis for non-small cell lung cancer. Sci. Rep. 7(1), 46349 (2017).
Wang, H., Li, Z., Li, Y., Gupta, B. B. & Choi, C. Visual saliency guided complex image retrieval. Patt. Recognit. Lett. 130, 64–72 (2020).
Broocks, G. et al. Computed tomography-based imaging of voxel-wise lesion water uptake in ischemic brain: Relationship between density and direct volumetry. Investig. Radiol. 53(4), 207–213 (2018).
Weiskopf, N., Edwards, L. J., Helms, G., Mohammadi, S. & Kirilina, E. Quantitative magnetic resonance imaging of brain anatomy and in vivo histology. Nat. Rev. Phys. 3(8), 570–588 (2021).
Li, J. et al. PRNU anonymous algorithm used for privacy protection in biometric authentication systems. Int. J. Semantic Web Inf. Syst. (IJSWIS) 19(1), 1–19 (2023).
Jena, B., Nayak, G. K. & Saxena, S. An empirical study of different machine learning techniques for brain tumor classification and subsequent segmentation using hybrid texture feature. Mach. Vis. Appl. 33(1), 6 (2022).
World Health Organization. Cancer Fact Sheet. https://www.who.int/news-room/fact-sheets/detail/cancer. Accessed 1 Mar 2024 (2023).
Chui, K. T., Gupta, B. B., Alhalabi, W. & Alzahrani, F. S. An MRI scans-based Alzheimer’s disease detection via convolutional neural network and transfer learning. Diagnostics 12(7), 1531 (2022).
Acharya, U. R., Faust, O., Sree, S. V., Molinari, F. & Suri, J. S. ThyroScreen system: High resolution ultrasound thyroid image characterization into benign and malignant classes using novel combination of texture and discrete wavelet transform. Comput. Methods Prog. Biomed. 107(2), 233–241 (2012).
Kuppili, V. et al. Extreme learning machine framework for risk stratification of fatty liver disease using ultrasound tissue characterization. J. Med. Syst. 41, 1–20 (2017).
Khan, A. A., Mahendran, R. K., Perumal, K. & Faheem, M. Dual-3DM3AD: Mixed transformer based semantic segmentation and triplet pre-processing for early multi-class Alzheimer’s diagnosis. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 696–707 (2024).
Alhussen, A., Haq, M. A., Khan, A. A., Mahendran, R. K. & Kadry, S. XAI-RACapsNet: Relevance aware capsule network-based breast cancer detection using mammography images via explainability O-net ROI segmentation. Expert Syst. Appl. 261, 125461 (2025).
Khan, A. A., Madendran, R. K., Thirunavukkarasu, U. & Faheem, M. D2PAM: Epileptic seizures prediction using adversarial deep dual patch attention mechanism. CAAI Trans. Intell. Technol. 8(3), 755–769 (2023).
Bagla, P. & Kumar, K. TA-WHI: Text analysis of web-based health information. Int. J. Softw. Sci. Comput. Intell. (IJSSCI) 15(1), 1–14 (2023).
Leghari, I. M. & Ali, S. A. Artificial intelligence techniques to improve cognitive traits of Down syndrome individuals: An analysis. Int. J. Softw. Sci. Comput. Intell. (IJSSCI) 15(1), 1–11 (2023).
Dai, J., Li, Y., He, K., & Sun, J. R-fcn: Object detection via region-based fully convolutional networks. Adv. Neural Inf. Process. Syst.29 (2016).
El-Baz, A., & Suri, J. S. Neurological Disorders and Imaging Physics, Volume 2: Engineering and Clinical Perspectives of Multiple Sclerosis. (IOP Publishing, 2019).
Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 61, 85–117 (2015).
Tan, J., Zhou, W., Lin, L. & Jumahong, H. A review of semantic medical image segmentation based on different paradigms. Int. J. Semantic Web Inf. Syst. (IJSWIS) 20(1), 1–25 (2024).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521(7553), 436–444 (2015).
Xu, Y., Xie, L., Huang, H., Yu, F. & Zhao, T. Semantic-driven ghosting-free image fusion using dual gain video stream with FPGA. Int. J. Semantic Web Inf. Syst. (IJSWIS) 21(1), 1–21 (2025).
Hammad, M., Alkinani, M. H., Gupta, B. B., & Abd El-Latif, A. A. Myocardial infarction detection based on deep neural network on imbalanced data. Multimed. Syst. 1–13 (2022).
Kang, L., Kumar, J., Ye, P., Li, Y., & Doermann, D. Convolutional neural networks for document image classification. In 2014 22nd International Conference on Pattern Recognition. 3168–3172. (IEEE, 2014).
Sedik, A., Hammad, M., Abd El-Samie, F. E., Gupta, B. B., & Abd El-Latif, A. A. Efficient deep learning approach for augmented detection of coronavirus disease. Neural Comput. Appl. 1–18 (2022).
Chen, X., Wang, J. & Zhu, L. Deep learning-driven multi-stage product visual emotional design utilizing web semantics. Int. J. Semantic Web Inf. Syst. (IJSWIS) 20(1), 1–29 (2024).
Singh, K. S. Linux Yourself: Concept and Programming. 1st Ed. (Chapman and Hall/CRC, 2021). https://doi.org/10.1201/9780429446047.
Gao, S., Cheng, Y., Mao, S., Fan, X. & Deng, X. SSVEP-enhanced threat detection and its impact on image segmentation. Int. J. Semantic Web Inf. Syst. (IJSWIS) 20(1), 1–20 (2024).
Hammad, M. et al. Deep learning models for arrhythmia detection in IoT healthcare applications. Comput. Electr. Eng. 100, 108011 (2022).
Venkatesh, C. et al. A hybrid model for lung cancer prediction using patch processing and deep learning on CT images. Multimed. Tools Appl. 83(15), 43931–43952 (2024).
Sh., K. & Kumar, M. Predictive analysis of groundwater resources using random forest regression. J. Artif. Intell. Metaheuristics 9(1), 11–19 (2025). https://doi.org/10.54216/JAIM.090102.
Raza, R. et al. Lung-EffNet: Lung cancer classification using EfficientNet from CT-scan images. Eng. Appl. Artif. Intell. 126, 106902 (2023).
Dash, S., Padhy, S., Suman, P. & Das, R. K. Enhancing lung cancer diagnosis through CT scan image analysis using mask-EffNet. Eng. Access 11(1), 92–107 (2025).
Murthy, N. N. & Thippeswamy, K. Fuzzy-ER Net: Fuzzy-based efficient residual network-based lung cancer classification. Comput. Electr. Eng. 121, 109891 (2025).
Sultana, Z., Foysal, M., Islam, S., & Al Foysal, A. Lung cancer detection and classification from chest CT images using an ensemble deep learning approach. In 2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT). 364–369. (IEEE, 2024).
Tawfik, N. et al. Enhancing early detection of lung cancer through advanced image processing techniques and deep learning architectures for CT scans. Comput. Mater. Contin. 81(1), 271–307 (2024).
Shatnawi, M. Q., Abuein, Q. & Al-Quraan, R. Deep learning-based approach to diagnose lung cancer using CT-scan images. Intell.-Based Med. 11, 100188 (2025).
Gulsoy, T. & Kablan, E. B. FocalNeXt: A ConvNeXt augmented FocalNet architecture for lung cancer classification from CT-scan images. Expert Syst. Appl. 261, 125553 (2025).
Flyckt, R. N. H. et al. Pulmonologists-level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach. Sci. Rep. 14(1), 30630 (2024).
UrRehman, Z. et al. Effective lung nodule detection using deep CNN with dual attention mechanisms. Sci. Rep. 14(1), 3934 (2024).
Shah, A. A., Malik, H. A. M., Muhammad, A., Alourani, A. & Butt, Z. A. Deep learning ensemble 2D CNN approach towards the detection of lung cancer. Sci. Rep. 13(1), 2987 (2023).
Arora, L. et al. Ensemble deep learning and EfficientNet for accurate diagnosis of diabetic retinopathy. Sci Rep 14, 30554. https://doi.org/10.1038/s41598-024-81132-4 (2024).
Chest CT Scan Dataset. https://www.kaggle.com/datasets/mohamedhanyyy/chest-ctscan-images.
Lee, B. et al. Breath analysis system with convolutional neural network (CNN) for early detection of lung cancer. Sens. Actuators B Chem. 409, 135578 (2024).
Ansari, M. M., Kumar, S., Tariq, U., Heyat, M. B. B., Akhtar, F., Bin Hayat, M. A. & Pomary, D. Evaluating CNN architectures and hyperparameter tuning for enhanced lung cancer detection using transfer learning. J. Electr. Comput. Eng. 2024(1), 3790617 (2024).
Yang, X. S. Flower pollination algorithm for global optimization. In International Conference on Unconventional Computing and Natural Computation. 240–249. (Springer, 2012).
Mammeri, S., Amroune, M., Haouam, M. Y., Bendib, I. & Corrêa Silva, A. Early detection and diagnosis of lung cancer using YOLO v7, and transfer learning. Multimed. Tools Appl. 83(10), 30965–30980 (2024).
Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. 248–255. (IEEE, 2009) .
Quasar, S. R. et al. Ensemble methods for computed tomography scan images to improve lung cancer detection and classification. Multimed. Tools Appl. 83(17), 52867–52897 (2024).
Jain, R., Singh, P., Abdelkader, M. & Boulila, W. Efficient lung cancer detection using computational intelligence and ensemble learning. Plos one 19(9), e0310882 (2024).
Başaran, E., Cömert, Z., Şengür, A., Budak, Ü., Çelik, Y., & Toǧaçar, M. Chronic tympanic membrane diagnosis based on deep convolutional neural network. In 4th International Conference on Computer Science and Engineering (UBMK). 1-4. ( IEEE, 2019).
Sertkaya, M. E., Ergen, B., & Togacar, M. Diagnosis of eye retinal diseases based on convolutional neural networks using optical coherence images. In 2019 23rd International Conference Electronics. 1–5. (IEEE, 2019).
Atteia, G. et al. Adaptive dynamic dipper throated optimization for feature selection in medical data. Comput. Mater. Contin. 75(1), 1883–1900 (2023).
Lakshmi, K. V. & Kumara, K. U. A novel randomized weighted fuzzy AHP by using modified normalization with the TOPSIS for optimal stock portfolio selection model integrated with an effective sensitive analysis. Expert Syst. Appl. 243, 122770 (2024).
Acknowledgements
This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under grant No. (IFPIP-1127-611-1443). The authors therefore acknowledge with thanks the DSR’s technical and financial support.
Author information
Contributions
Final Manuscript Revision, funding, Supervision: Varsha Arya, Wadee Alhalabi, Brij B. Gupta, Turki A. Althaqafi; study conception and design, analysis and interpretation of results, methodology development: Achin Jain, Arun Kumar Dubey, Liang Zhou; data collection, draft manuscript preparation, figure and tables: Sunil K. Singh, Neha Gupta, Arvind Panwar, Sudhakar Kumar
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhou, L., Jain, A., Dubey, A.K. et al. FPA-based weighted average ensemble of deep learning models for classification of lung cancer using CT scan images. Sci Rep 15, 19369 (2025). https://doi.org/10.1038/s41598-025-02015-w