Introduction

Recent reports from various medical organizations highlight a concerning rise in lung cancer deaths1. Cancer is the outcome of an aberration in the regular regulation of cellular growth38. Lung cancer remains the leading cause of cancer-related mortality worldwide, accounting for approximately 18% of all cancer deaths, which translates to an estimated 1.8 million fatalities annually2. Nearly a quarter of all cancer deaths are attributed to lung cancer, with 82% of cases directly linked to smoking2. Cancer can develop in any part of the body and affects individuals of all ages3. The most common types of cancer include lung, breast, colorectal, liver, and stomach cancers3. Lung cancer specifically originates in the lungs, a pair of organs located in the chest responsible for respiration, the exchange of oxygen and carbon dioxide in the body3. According to recent studies by the World Health Organization (WHO), lung cancer remains a leading cause of global mortality, responsible for nearly 7.6 million deaths annually4. Medical imaging techniques, such as Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) scanning, play a crucial role in providing detailed anatomical images of the lungs5, enabling the detection of abnormalities such as lung nodules, small tissue masses that appear as white shadows on CT scans and X-rays6. In addition to imaging, diagnostic methods such as biopsy, along with modalities like CT and ultrasound, are employed for tissue examination and cancer detection7. To assess lung health, a combination of imaging techniques, including CT, Positron Emission Tomography (PET), MRI, and X-ray, is commonly used8. Treatment options for lung cancer typically involve a multidisciplinary approach, including surgery, chemotherapy, radiation therapy, targeted therapy, immunotherapy, or a combination of these treatments9.

The existing procedure for diagnosing lung cancer can be time-consuming and constrained by the number of cases a pathologist can effectively analyze within a limited timeframe10. Many existing automated models rely on basic Machine Learning (ML) techniques, which often lack the sophistication needed to fully capture the complex patterns inherent in medical images11. The feature representation and the classification algorithm can be updated simultaneously using the useful samples that have been incrementally annotated36. However, the introduction of Artificial Intelligence (AI)-powered tools has the potential to advance pathological diagnostics by identifying key prognostic and predictive features, thereby supporting pathologists, pulmonologists, and thoracic oncologists in patient management as decision support systems12. Deep Learning (DL) algorithms, particularly those capable of learning high-level features directly from raw Chest X-ray (CXR) images, offer an advanced approach to this challenge13. ML methods are already widely utilized for medical image analysis, where pattern recognition and classification assist in computer-aided diagnosis14. Specifically, Convolutional Neural Networks (CNNs) have become a foundational tool for the in-depth interpretation and analysis of medical imaging data15. A promising architecture, DenseNet (Densely Connected Convolutional Networks), has proven effective in various image-related tasks, including medical image analysis16. This model is particularly valuable in distinguishing between cancerous and normal tissues, as well as identifying the tissue origin of cancers with high accuracy17.

Early and accurate diagnosis of lung cancer is made difficult by the shortcomings of existing classification methods, which frequently include low prediction accuracy, poor feature extraction, and inefficient segmentation procedures. Due to inadequate feature representation and preprocessing, traditional models usually have trouble distinguishing pathogenic from benign tissues and capturing intricate lung architecture. A number of improvements have been made to the proposed CMN-ShuffleNet model to overcome these issues. Histogram Equalization (HE) is first used in the preprocessing step to enhance image contrast, which is essential for precise analysis. By enhancing feature preservation during segmentation and lowering errors caused by noise and low-contrast areas, the mRRB layer improves the network’s capacity to handle complex lung shapes and irregular tumor boundaries. We further show that the mELS-PReLU activation minimizes interpretative subjectivity by permitting more adaptive and discriminative nonlinear transformations, resulting in greater distinction between non-cancer and cancer patterns. Furthermore, the weighted root mean power in the ILGP feature extraction enhances sensitivity to minute gradient variations, resolving issues with fine-grained texture interpretation in images. In light of these advancements, this study proposes a new DL model for lung cancer classification from lung images using the CMN-ShuffleNet model.

The contributions of this research are defined below.

  • Introducing the mRRB-SegNet Model for Lobe Segmentation to segment the lobes from the pre-processed image. This model is modified with an mRRB layer and the mELS-PReLU activation function. This modification improves the model’s capability to gather long-range dependencies and the accuracy of the segmentation. The enhanced M-SegNet for segmentation is important for improving the scalability, accuracy, and efficiency of computer-aided diagnostic (CAD) systems.

  • Proposing ILGP to extract texture features. The ILGP is enhanced by incorporating the weighted root mean power to calculate an adaptive threshold. This improvement helps identify small intensity changes in pixels and increases the discriminative power of the texture features. More discriminative information regarding tumor morphology is captured by the method by utilizing gradient and shape-based features, which allows for a more accurate categorization of benign and malignant nodules.

  • Developing the CMN-ShuffleNet model for classifying lung cancer. This model is enhanced through the CMN layer and CBAM. These improvements introduce both channel and spatial attention mechanisms, which significantly boost the classification performance.

  • ShuffleNet’s lightweight design guarantees quicker inference with less computing cost, which makes it ideal for use in clinical settings, particularly those with limited resources. When combined, these developments provide a strong and effective pipeline for the early detection of lung cancer, which may enhance patient outcomes by enabling prompt and precise diagnosis.

The study on lung cancer classification is systematically designed as follows. A detailed review of existing studies and methodologies in the field of lung cancer classification is given in "Literature Review" Section. The methodology behind the CMN-ShuffleNet model for lung cancer classification is given in "Proposed methodological process for CMN-ShuffleNet model-based Lung Cancer Classification" Section. "Results and discussion" Section gives a comprehensive evaluation of the experimental outcomes attained through the CMN-ShuffleNet model, followed by the conclusion of this study in "Conclusion" Section.

Literature review

This section highlights a detailed review of eight key studies on lung cancer classification.

In 2023, Naseer et al.18 developed an automated method for lung cancer detection in CT scans using computational intelligence. In 2023, Ragab et al.19 presented a Self-Upgraded Cat Mouse Optimizer with Machine Learning Driven Lung Cancer Classification (SCMO-MLL2C) technique for lung cancer classification from CT. In 2024, Imran et al.20 introduced a hybrid Deep Learning (DL) model that combined Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to classify Non-Small Cell Lung Cancer (NSCLC) into normal, adenocarcinoma, and Squamous Cell Carcinoma (SCC). In 2024, Mohamed and Ezugwu et al.21 employed an innovative DL model for lung cancer detection, which integrated messenger RNA (mRNA), MicroRNA (miRNA), and Deoxyribonucleic Acid (DNA) methylation markers. In 2024, Amin et al.22 applied a DL model with CNNs to RNA-Seq, miRNA-Seq, and WSIs for lung cancer classification. In 2024, Sampangi Rama Reddy et al.23 designed an SNN architecture to detect and classify lung cancer using CT scan images. In 2024, Uddin et al.24 implemented two novel dense architectures, D1 and D2, designed for classifying colon and lung cancer using various image datasets. In 2024, Sangeetha et al.25 developed a Multimodal Fusion Deep Neural Network (MFDNN) for lung cancer detection. In 2024, Uddin et al.16 deployed DenseNet for lung cancer detection, taking advantage of its capacity to continually pass learned features forward through each layer. This helps to better capture the structural complexity and uneven distribution in CT scans and histopathological cancer images by lowering model parameters and improving local feature learning. In 2025, Klangbunrueang et al.35 introduced and assessed Convolutional Neural Network (CNN) models, specifically the Visual Geometry Group 16 (VGG16) architecture, to categorize CT scan images of lung cancer into three groups: benign, malignant, and normal.
Table 1 shows the features and challenges of the existing models.

Table 1 Features and limitations in existing lung cancer classification methods.

Research gaps

The existing methods for lung cancer classification exhibited several challenges, which were emphasized in several studies. Naseer et al.18 employed the Modified U-Net for the lobe segmentation process, but this model was unable to capture long-range relationships and contextual information across the image. Similarly, Ragab et al.19 utilized DenseNet-201 for feature extraction, but this method was insufficient for capturing complex textures and patterns. In the same study, the ENN model was used for classifying lung cancer, while Sangeetha et al.25 applied the MFDNN model for classification. However, these models limited the network’s ability to capture intricate features for accurate lung cancer classification. The efficacy of the existing ShuffleNet for lung cancer classification with segmentation is limited in clinical practice by a number of issues. First, while ShuffleNet’s lightweight architecture delivers computational efficiency, it may sacrifice feature extraction depth and representational power, which could result in decreased accuracy in complicated or ambiguous scenarios. The diagnostic context is further limited by the absence of integration with other clinical data (such as patient history and biomarkers), which may have an impact on the reliability of the judgment. Furthermore, generalizability across various imaging modalities, scanner types, and patient demographics is still an issue; implementation in various healthcare systems necessitates significant retraining or domain customization. These limitations showed the need for better deep learning methods that could more accurately identify the unique features of lung cancer. In response to these challenges, a new CMN-ShuffleNet model was proposed for lung cancer classification.
Lung cancer classification accuracy is improved, while computational efficiency is maintained, by the lightweight yet highly discriminative feature extraction provided by the Modified ShuffleNet trained on gradient pattern and shape-based features. Together with enhanced M-SegNet segmentation, which accurately delineates tumor borders, the system gains more precise region-of-interest localization, which lowers false positives and enhances overall diagnostic performance. Both fine-grained texture and structural signals are used in this integration to provide reliable and interpretable lung cancer diagnosis.

Proposed methodological process for CMN-ShuffleNet model-based lung cancer classification

Lung cancer is marked by the uncontrolled growth of cells in lung tissue, which remains a significant global health issue. This often leads to substantial morbidity and mortality. Accurate and timely detection, along with proper staging, is crucial for identifying the most effective treatment and predicting patient outcomes. The CMN-ShuffleNet model-based Lung cancer classification includes image preprocessing, lobe segmentation, feature extraction, and classification phases.

  1. Preprocessing: Histogram Equalization is applied to the input lung image to improve the image contrast.

  2. Lobe Segmentation: The mRRB-SegNet model is used for segmenting the lung lobe in the pre-processed image.

  3. Feature Extraction: Features such as ILGP, shape features, and statistical features are extracted in this step.

  4. Classification: The CMN-ShuffleNet model classifies the lung cancer based on the extracted features. The proposed method process is visually shown in Fig. 1.

Fig. 1

Framework of the proposed CMN-ShuffleNet model for Lung Cancer Classification.

Preprocessing using the histogram equalization technique

During the pre-processing phase, the input lung image \(l_{i}\) (\(l_{i} = \left\{ {l_{1} ,l_{2} , \ldots ,l_{n} } \right\}\)) is enhanced to improve its quality and make the relevant features more detectable for further analysis. One common technique used for enhancing the images is histogram equalization.

Histogram Equalization26 is a technique used to enhance the contrast of an image \(l_{i}\). The primary concept is to modify the image’s histogram to approximate a uniform distribution, which effectively enhances its contrast.

This transformation is performed by mapping the original image to a new image where the gray levels are evenly distributed with a specific transformation function. After performing histogram equalization, the CDF of the image’s gray levels is denoted as \(g_{{l_{i} }}^{CDF}\) which is formulated by the mathematical formula as given in Eq. (1).

$$g_{{l_{i} }}^{CDF} = T_{f} \left( {p_{{l_{i} }} } \right) = \sum\limits_{i = 0}^{l} {r_{fr} } \left( {p_{{l_{i} }} } \right) = \sum\limits_{i = 0}^{l} {\frac{{n_{i} }}{n}} ,\left( {l = 0,1,2,...,L - 1} \right)$$
(1)

Here, the original image’s pixel value is indicated as \(p_{{l_{i} }}\), and \(T_{f}\) is the transformation function. \(g_{{l_{i} }}^{CDF}\) signifies the transformed pixel value, the number of pixels at gray level \(i\) is denoted as \(n_{i}\), the total number of pixels in the image is implied as \(n\), and the relative frequency of each gray level in the normalized histogram after histogram equalization is expressed by \(r_{fr}\).

As a result, the histogram equalization enhances the lung image’s quality and its output as \(p_{i}\) (i.e., pre-processed image).
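To make the mapping of Eq. (1) concrete, histogram equalization can be sketched in a few lines of Python; the function name and the normalization to the 0–255 range are illustrative, not part of the proposed pipeline:

```python
import numpy as np

def histogram_equalize(img):
    # img: 2-D uint8 grayscale image.
    # Build the normalized histogram r_fr and its cumulative sum (the
    # CDF of Eq. (1)), then map each pixel through T_f scaled to [0, 255].
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size
    return np.round(cdf[img] * 255).astype(np.uint8)
```

Applied to a low-contrast patch whose gray levels cluster in a narrow band, the mapping spreads them across the full 0–255 range.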

Lobe segmentation through mRRB-SegNet model

Once the image has been pre-processed, it is passed into the lobe segmentation process, which is crucial for isolating the lung lobes. Segmentation is the procedure of identifying and separating specific Regions of Interest (ROI) in the pre-processed image \(p_{i}\), such as the lung lobes in this case. The primary objective of lobe segmentation is to distinguish the lung regions from the background or other surrounding tissues in the image.

To perform the lung lobe segmentation, the SegNet model can be employed. SegNet is a CNN architecture27 designed for pixel-wise classification tasks like image segmentation. SegNet works by encoding the input image into feature representations via an encoder network and then decoding it back into a segmented output via a decoder network. However, the standard SegNet struggles to capture long-range dependencies and to preserve fine structural details at lobe boundaries. To overcome these limitations, the Modified Residual Recurrent-Based SegNet (mRRB-SegNet) model is introduced in this phase. This mRRB-SegNet architecture incorporates two main improvements: residual connections and recurrent layers.

Architecture of mRRB-SegNet

The mRRB-SegNet model architecture comprises two main components: the Encoder Network and Decoder Network, which are depicted in Fig. 2.

Fig. 2

Architecture of the mRRB-SegNet model.

Encoder network

The encoder network in mRRB-SegNet takes the preprocessed image \(p_{i}\) as input and processes it through a series of layers, which consist of m-RRB and max-pooling operations. The preprocessed image is first passed into the encoder, where it undergoes multiple stages of feature extraction.

  1. (i)

    m-RRB Layers

    Each m-RRB layer consists of two recurrent blocks28 as shown in Fig. 3. These blocks help capture both local features (via convolutions) and long-range dependencies (via recurrent connections). First, the input feature map is processed by a 2D convolutional layer. This process is repeated multiple times, with the number of repetitions \(t\) typically being \(t = 2\). The m-RRB layer includes three consecutive layers:

    • Convolution Layer: This layer extracts low-level features.

    • BN Layer: Normalizes the output of the convolution to stabilize training.

    • mELS-PReLU Layer: By enhancing the model’s ability to acquire complex features, this activation function plays a key role. Conventionally, PReLU is applied as given in Eq. (2).

      $$f\left( {z_{i} } \right) = \left\{ {\begin{array}{*{20}c} {z_{i} ;} & {if\,\,z_{i} > 0} \\ {a_{i} z_{i} ;} & {if\,\,z_{i} \le 0} \\ \end{array} } \right.$$
      (2)

    But this PReLU activation makes the model more complex, which can cause overfitting and slower training. Therefore, mELS-PReLU activation is used in this model, which is expressed in Eq. (3).

    $$f\left( {z_{i} } \right)_{new} = \left\{ {\begin{array}{*{20}c} {z_{i} ;} & {if\,\,z_{i} > 0} \\ {\frac{{e^{f\left( z \right)} - 1}}{{1 + e^{ - f\left( z \right)} }};} & {if\,\,z_{i} \le 0} \\ \end{array} } \right.,\;where\,\,f\left( z \right) = \log \left( {1 + e^{{z_{i} }} } \right)$$
    (3)
  2. (ii)

    Max-pooling

    After each m-RRB layer, the feature maps are downsampled via max-pooling, which decreases their spatial resolution while preserving important features. During the downsampling process, pooling indices are stored and later utilized in the decoder for nonlinear upsampling. The encoder network has multiple layers stacked in a sequence (repeating m-RRB + max-pooling layers), and the pooling indices from these layers are given to the decoder network for reconstruction.
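Reading Eq. (3) literally, with \(f(z)\) being the softplus of the input, the mELS-PReLU activation can be sketched elementwise in NumPy (the function name is illustrative):

```python
import numpy as np

def mels_prelu(z):
    # mELS-PReLU (Eq. 3): identity for positive inputs; for non-positive
    # inputs, a smooth transform built from f(z) = log(1 + e^z) (softplus).
    fz = np.log1p(np.exp(z))
    neg = (np.exp(fz) - 1.0) / (1.0 + np.exp(-fz))
    return np.where(z > 0, z, neg)
```

Unlike PReLU, the negative branch has no learnable slope parameter, which avoids the extra parameters that Eq. (2) introduces.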

Fig. 3

Structure of the mRRB layer.

Decoder network

It takes the output from the last max-pooling operation in the encoder network and performs nonlinear upsampling to gradually reconstruct the segmented image.

  1. (i)

    Upsampling Layers

    The decoder network performs upsampling (inverse of pooling) through pooling indices from the encoder. Restoring the spatial resolution of the feature maps to match the original image dimensions is achieved through this vital upsampling process.

  2. (ii)

    m-RRB Layers in Decoder

    Similar to the encoder network, each upsampling step is followed by an m-RRB layer. The recurrent layers in the decoder help refine the segmented output by capturing long-range contextual information. The decoder network consists of multiple stages, each comprising an upsampling layer followed by an m-RRB layer.

After the decoder network processes the feature maps, the final output is a segmented image \(s_{i}\). The mRRB-SegNet model improves segmentation accuracy by better capturing long-range dependencies. This is because the mRRB layer allows for deeper network architectures, which enables the model to learn more complex and informative features.

Feature extraction

From the segmented image \(s_{i}\), features are extracted. This phase is crucial as it derives relevant characteristics from the image to assist in further analysis. During this phase, various features are extracted, which consist of ILGP, shape features, and statistical features.

Improved local gradient pattern

Local Gradient Pattern (LGP) captures the local intensity changes within an image. It is an adaptive threshold-based feature descriptor which is primarily used to detect changes in intensity values between a center pixel and its surrounding neighbors. LGP extracts key texture features from the segmented lung image \(s_{i}\).

LGP29 operates by first examining a neighborhood of size \(P_{s} \times P_{s}\) surrounding a central pixel \(P_{ce}\). For each adjacent pixel \(P_{i}\) in this neighborhood, the gradient value \(gr_{i}\) is computed as the absolute difference \(gr_{i} = \left| {P_{i} - P_{ce} } \right|\), which reflects the local intensity variation. Next, the arithmetic mean (AM) of the gradient values, denoted as \(gr_{am}\), is calculated as \(gr_{am} = \frac{1}{N}\sum\nolimits_{i = 0}^{N - 1} {gr_{i} }\), where \(N\) is the number of neighboring pixels. The equation for the LGP is represented by Eqs. (4) and (5).

$$LGP_{N,ra} \left( {y_{c} ,x_{c} } \right) = \sum\limits_{i = 0}^{N - 1} {S\left( {gr_{i} - gr_{am} } \right)} 2^{i}$$
(4)
$$S\left( c \right) = \left\{ {\begin{array}{*{20}c} 1 & {if\,\,c \ge 0} \\ 0 & {otherwise} \\ \end{array} } \right.$$
(5)

In the above equations, \(ra\) indicates the radius, the coordinates of the center pixel are represented as \(\left( {y_{c} ,x_{c} } \right)\), and the difference between the threshold \(gr_{am}\) and the gradient \(gr_{i}\) of a neighboring point is implied as \(c\).

The conventional LGP may not be sufficient to capture such complex patterns, which leads to limited discriminative power. An Improved Local Gradient Pattern (ILGP) is proposed to overcome the above limitation. This ILGP enhances the standard LGP by employing a weighted root mean power (WRMP) to calculate the adaptive threshold, which is more robust to complex textures and variations in intensity. Equation (6) expresses the formula for ILGP.

$$ILGP_{N,ra} \left( {y_{c} ,x_{c} } \right) = \sum\limits_{i = 0}^{N - 1} {S\left( {gr_{i} - gr_{w - rmp} } \right)} 2^{i}$$
(6)

After the gradient values are calculated for all neighboring pixels, the WRMP of the gradient value \(gr_{w - rmp}\) is estimated via Eq. (7).

$$gr_{w - rmp} = \left[ {\frac{{\sum\limits_{i = 1}^{n} {we_{i} .n_{{p_{i} }}^{Po} } }}{{\sum\limits_{i = 1}^{n} {we_{i} } }}} \right]^{{{\raise0.7ex\hbox{$1$} \!\mathord{\left/ {\vphantom {1 {Po}}}\right.\kern-0pt} \!\lower0.7ex\hbox{${Po}$}}}}$$
(7)

Then, the weight of the gradient is evaluated by Eq. (8).

$$we\left( {n_{p} } \right) = \frac{{B * n_{p} }}{{\sqrt {C - \frac{2.A}{{B^{2} }} - n_{p}^{2} } }}$$
(8)

In Eq. (8), \(A = 1,B = 5,C = 4\), \(n_{p}\)30 is the normalized neighboring pixel, and \(Po = 2\) is the power value.

For example, consider the neighbor pixel = [ 44, 12, 120, 0, 78, 35, 87, 34].

\(n_{p}\) is calculated by normalizing the neighbor pixel \(n_{p}\) = neighbor pixel /255.

\(n_{p}\) = [0.17254902, 0.04705882, 0.47058824, 0, 0.30588235, 0.1372549, 0.34117647, 0.13333333].

Substituting \(A = 1,B = 5,C = 4\) into Eq. (8) gives

\(we\left( {n_{p} } \right) =\)[0.4374,0.1188,1.2234,0.0,0.7818,0.3474,0.8746,0.3374].

With \(Po = 2\), Eqs. (9), (10), and (11) give

$$weighted\_sum = \sum\limits_{i = 1}^{n} {we_{i} \cdot n_{{p_{i} }}^{Po} }$$
(9)
$$weight\_sum = \sum\limits_{i = 1}^{n} {we_{i} }$$
(10)
$$gr_{w - rmp} = \left( {weighted\_sum/weight\_sum} \right)^{1/Po}$$
(11)

\(gr_{w - rmp}\) = 86.27 (the normalized result, 0.3383, rescaled back to the 0–255 intensity range).

ILGP enhances the conventional LGP by introducing a weighted root mean power to compute an adaptive threshold, which addresses the limitations of LGP in capturing complex textures. ILGP produces a smaller threshold value, which helps in detecting subtle changes in discriminative information. As a result, it can accurately identify small changes in the intensities of specific pixels, thereby increasing its discriminative power. The output from this ILGP is indicated as \(ILGP_{i}\).
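The worked example above can be reproduced with a short script; the helper names are illustrative, and the /255 normalization with final rescaling follows the example’s arithmetic:

```python
import numpy as np

A, B, C, Po = 1.0, 5.0, 4.0, 2.0  # constants from Eq. (8), with Po = 2

def weight(n_p):
    # Eq. (8): weight of a normalized neighbor value n_p
    return (B * n_p) / np.sqrt(C - 2.0 * A / B**2 - n_p**2)

def w_rmp(neighbors):
    # Weighted root mean power of Eq. (7) / Eqs. (9)-(11), computed on
    # the /255-normalized neighborhood and rescaled back to 0-255.
    n_p = np.asarray(neighbors, dtype=float) / 255.0
    we = weight(n_p)
    weighted_sum = np.sum(we * n_p**Po)   # Eq. (9)
    weight_sum = np.sum(we)               # Eq. (10)
    return (weighted_sum / weight_sum) ** (1.0 / Po) * 255.0  # Eq. (11)
```

For the neighborhood [44, 12, 120, 0, 78, 35, 87, 34] this yields approximately 86.27, matching the worked example.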

Shape features

Shape features37 define the geometrical properties of the segmented image \(s_{i}\). These shape features like nodule area, nodule irregularity index, and solidity, are extracted from the segmented image \(s_{i}\).

  • Nodule area: It measures the size of the nodule in terms of pixel count in Eq. (12).

    $$A = \sum\limits_{(X,Y) \in R} 1$$
    (12)

    where the region that represents the segmented nodule is indicated by \(R\).

  • Nodule irregularity index: It measures the complexity of the nodule’s shape, often through metrics like the roundness or smoothness in Eq. (13).

    $$I = \frac{{P^{2} }}{4\pi A}$$
    (13)

    where \(A\) is the area and \(P\) is the nodule’s perimeter. A more circular form is indicated by a value near 1, whereas irregularity is suggested by higher values.

  • Solidity: It is determined by dividing the area of the nodule by the area of its convex hull, as given in Eq. (14). The shape features output is signified as \(Sh_{i}\).

    $$Solidity = \frac{A}{{A_{Convexhull} }}$$
    (14)

    where \(A\) represents the nodule’s area and \(A_{Convexhull}\) is the area of the convex hull, the smallest convex region that encloses the nodule.

The range of solidity values is 0 to 1: a value near 1 indicates a solid, compact nodule (close to a convex shape), whereas a value near 0 indicates a more asymmetrical or fragmented shape.
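As a small sketch of Eqs. (13) and (14), assuming the area, perimeter, and convex-hull area have already been measured from the segmented mask (the function names are illustrative):

```python
import math

def irregularity_index(perimeter, area):
    # Eq. (13): P^2 / (4*pi*A); equals 1 for a perfect circle and grows
    # with boundary irregularity.
    return perimeter**2 / (4.0 * math.pi * area)

def solidity(area, convex_hull_area):
    # Eq. (14): fraction of the convex hull occupied by the nodule.
    return area / convex_hull_area
```

For a circle of radius 5 (perimeter \(2\pi \cdot 5\), area \(\pi \cdot 25\)), the irregularity index is exactly 1.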

Statistical features

Statistical features describe the quantifiable measures derived from the pixel intensity distribution of a segmented image \(s_{i}\).

  • Average gray level: It represents the mean intensity of pixels in the image.

  • Standard deviation: It evaluates the variation in pixel intensities.

  • Third moment: It captures the skewness or asymmetry in the intensity distribution, which can highlight irregularities.

  • Entropy: It quantifies the randomness or disorder in the pixel distribution.

  • Uniformity: It measures how evenly the pixel intensities are distributed.

Therefore, the statistical features output is signified as \(St_{i}\). In this feature extraction phase, the features extracted from the segmented image are collectively denoted by \(\left( {fe_{i} = ILGP_{i} ,Sh_{i} ,St_{i} } \right)\).
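The five statistical features can be sketched from the normalized gray-level histogram (the function name is illustrative, not the authors’ implementation):

```python
import numpy as np

def statistical_features(img, levels=256):
    # p[z] is the normalized histogram of the segmented gray image.
    p = np.bincount(img.ravel(), minlength=levels) / img.size
    z = np.arange(levels, dtype=float)
    mean = np.sum(z * p)                         # average gray level
    std = np.sqrt(np.sum((z - mean) ** 2 * p))   # standard deviation
    third = np.sum((z - mean) ** 3 * p)          # third moment (skewness)
    nz = p[p > 0]
    entropy = -np.sum(nz * np.log2(nz))          # randomness/disorder
    uniformity = np.sum(p ** 2)                  # energy of the histogram
    return mean, std, third, entropy, uniformity
```

For a perfectly flat region, the standard deviation, third moment, and entropy are all zero, while uniformity reaches its maximum of 1.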

Lung cancer classification via CMN-ShuffleNet Model

In the classification phase, the features extracted from the segmented image are provided as the input to the classification model that will make the final prediction (cancer or non-cancer). This phase precisely classifies the lung cancer with extracted features \(fe_{i}\).

ShuffleNet is a lightweight CNN architecture31 which is designed for efficient performance on mobile devices. It uses channel shuffle operations to improve the performance of group convolutions. ShuffleNet is applied to classify lung cancer using the extracted features \(fe_{i}\).

  1. (i)

    Group Convolutions: This reduces the number of parameters and computation required.

  2. (ii)

    Channel Shuffle: This allows the model to mix information across different channels efficiently.
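The channel shuffle step can be sketched as a reshape, transpose, and reshape over the channel axis (a generic illustration of the ShuffleNet operation, not the exact model code):

```python
import numpy as np

def channel_shuffle(x, groups):
    # x: (N, C, H, W) feature maps. Splitting C into groups, transposing
    # the group axes, and flattening back mixes information between the
    # channel groups produced by a group convolution.
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)
```

With 6 channels and 2 groups, channel order [0, 1, 2, 3, 4, 5] becomes [0, 3, 1, 4, 2, 5], interleaving the two groups so subsequent group convolutions see features from both.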

However, the limitation of the conventional ShuffleNet model lies in its aggressive channel reduction and reliance on group convolutions, which hinder the model’s ability to capture fine-grained features. Therefore, the CMN-ShuffleNet model is introduced, as visually expressed in Fig. 4.

Fig. 4

Architecture of the CMN-ShuffleNet model.

Architecture of the CMN-ShuffleNet model

The CMN-ShuffleNet model consists of several stages, each designed to process the input features and make predictions about lung cancer presence.

Components of the CMN-ShuffleNet model

Input Layer: This is where the extracted features \(fe_{i}\) from lung images are fed into the model.

Convolutional Layer with BN & ReLU: The input features undergo convolution operations that help extract local features in the data. Next, BN is applied, which helps stabilize training by normalizing activations. Then, the non-linearity in the network is introduced via ReLU.

Max Pooling: The feature map dimensionality is reduced by this max pooling.

Stage 1

  • Input Layer: Extracted features from lung cancer images are given as input to the CMN-ShuffleNet model.

  • Conv Layer with BN and ReLU Activation: This conv layer processes the features, then applies the BN and a ReLU activation function.

  • Max Pooling: This layer reduces the spatial dimensions of the feature maps while retaining important features.

  • CMN Layer: A custom normalization technique is applied to the features, improving the model’s ability to focus on more relevant information; it is explained by the mathematical expression given in Eq. (15).

    $$\hat{y}_{ij} = \frac{{y_{ij} - \mu_{j} }}{{\sqrt {\nu_{{j_{mn} }}^{2} + \varpi } }}$$
    (15)

In Eq. (15), \(\mu_{j}\) denotes the average of each column, \(\nu_{{j_{mn} }}^{2}\) signifies the modified mean normalized variance, and \(\varpi\) indicates a small positive constant. The formulas for \(\mu_{j}\) and \(\nu_{{j_{mn} }}\) are expressed in Eqs. (16) and (17).

$$\mu_{j} = \frac{1}{m}\sum\limits_{i = 1}^{m} {y_{ij} }$$
(16)
$$\nu_{{j_{mn} }} = \frac{{y_{ij} - \mu_{j} }}{{\max \left( {y_{ij} } \right) - \min \left( {y_{ij} } \right)}}$$
(17)

Here, \(j\) indexes the mini-batch columns and \(\hat{y}_{ij}\) represents the normalized matrix.
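Reading Eqs. (15)–(17) literally, a column-wise sketch of the CMN normalization might look as follows; this is an interpretation, since the exact layer implementation is not given:

```python
import numpy as np

def cmn(y, eps=1e-5):
    # y: (m, features) mini-batch matrix; eps plays the role of
    # the small constant in Eq. (15).
    mu = y.mean(axis=0)                                  # Eq. (16)
    nu = (y - mu) / (y.max(axis=0) - y.min(axis=0))      # Eq. (17)
    return (y - mu) / np.sqrt(nu**2 + eps)               # Eq. (15)
```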

CBAM enhances feature representations by sequentially applying two attention mechanisms: channel attention and spatial attention32.

CBAM Process

It takes an intermediate feature map \(Fe \in {\mathbb{R}}^{C \times H \times W}\) as input and generates a refined output \(Fe^{^{\prime\prime}}\) through the following process:

A channel attention map \(Mp_{ch} \in {\mathbb{R}}^{C \times 1 \times 1}\) is first computed and applied to \(Fe\) via element-wise multiplication, as given in Eq. (18).

$$Fe{\prime} = Mp_{ch} \left( {Fe} \right) \otimes Fe$$
(18)

A spatial attention map \(Mp_{sp} \in {\mathbb{R}}^{1 \times H \times W}\) is then computed and applied to \(Fe{\prime}\), producing the refined output \(Fe^{^{\prime\prime}}\), as given in Eq. (19).

$$Fe^{^{\prime\prime}} = Mp_{sp} \left( {Fe{\prime} } \right) \otimes Fe{\prime}$$
(19)
  1. (1)

    Channel Attention: The channel attention mechanism captures inter-channel dependencies as depicted in Fig. 5. Each channel acts as a detector, and its relevance is computed by aggregating spatial information. Both average pooling and max pooling are applied independently, resulting in two descriptors: \(Fe_{avg}^{ch}\) and \(Fe_{\max }^{ch}\). These are passed through a shared MLP with one hidden layer to generate the attention map, as expressed in Eqs. (20) and (21).

    $$Mp_{ch} \left( {Fe} \right) = \vartheta \left( {MLP\left( {avgpool\left( {Fe} \right)} \right) + MLP\left( {\max pool\left( {Fe} \right)} \right)} \right)$$
    (20)
    $$Mp_{ch} \left( {Fe} \right) = \vartheta \left( {We_{1} \left( {We_{0} \left( {Fe_{avg}^{ch} } \right)} \right) + We_{1} \left( {We_{0} \left( {Fe_{\max }^{ch} } \right)} \right)} \right)$$
    (21)

    Here, the sigmoid function is denoted as \(\vartheta\). The shared MLP has weights \(We_{0} \in {\mathbb{R}}^{C/r \times C}\) and \(We_{1} \in {\mathbb{R}}^{C \times C/r}\), with a ReLU activation after \(We_{0}\). This design uses both pooling operations to effectively capture channel-specific importance.

  2. (2)

    Spatial Attention: Spatial attention identifies informative spatial regions by capturing inter-spatial relationships as shown in Fig. 6. Average pooling and max pooling are applied along the channel dimension to generate two 2D maps, \(Fe_{avg}^{sp}\) and \(Fe_{\max }^{sp}\). After concatenating these, a spatial attention map is computed using a convolution layer with a 7 × 7 kernel, as shown in Eqs. (22) and (23).

    $$Mp_{sp} \left( {Fe} \right) = \vartheta \left( {f_{c}^{7 \times 7} \left( {\left| {avgpool\left( {Fe} \right);\max pool\left( {Fe} \right)} \right|} \right)} \right)$$
    (22)
    $$Mp_{sp} \left( {Fe} \right) = \vartheta \left( {f_{c}^{7 \times 7} \left( {\left| {Fe_{avg}^{sp} ;Fe_{\max }^{sp} } \right|} \right)} \right)$$
    (23)
Fig. 5

Structure of the Channel Attention module.

Fig. 6

Structure of the Spatial Attention module.

Here, \(f_{c}^{7 \times 7}\) denotes a convolution operation with a filter size of 7 × 7.

The CBAM module enhances ShuffleNet by integrating channel and spatial attention mechanisms, enabling the model to concentrate on important features and improve classification performance. Additionally, CBAM dynamically adjusts its attention in response to the input features, allowing the network to prioritize patterns specific to lung cancer.
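As an illustrative sketch (not the authors' implementation), the channel attention of Eqs. (20) and (21), the spatial attention of Eqs. (22) and (23), and the refinement of Eq. (19) can be written in NumPy; the MLP weights and the 7 × 7 kernel are assumed to be given:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fe, w0, w1):
    """Channel attention map per Eqs. (20)-(21).
    fe: feature map (C, H, W); w0: (C//r, C); w1: (C, C//r)."""
    avg = fe.mean(axis=(1, 2))                    # average pooling -> (C,)
    mx = fe.max(axis=(1, 2))                      # max pooling -> (C,)
    mlp = lambda v: w1 @ np.maximum(w0 @ v, 0.0)  # shared MLP, ReLU after w0
    return sigmoid(mlp(avg) + mlp(mx))            # (C,)

def spatial_attention(fe, kernel):
    """Spatial attention map per Eqs. (22)-(23).
    fe: (C, H, W); kernel: (2, 7, 7) convolution filter, 'same' padding."""
    avg = fe.mean(axis=0)                         # pool along channels -> (H, W)
    mx = fe.max(axis=0)                           # -> (H, W)
    stacked = np.stack([avg, mx])                 # concatenation -> (2, H, W)
    ksz = kernel.shape[-1]
    pad = ksz // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    H, W = fe.shape[1:]
    out = np.empty((H, W))
    for i in range(H):                            # naive sliding-window conv
        for j in range(W):
            out[i, j] = np.sum(padded[:, i:i + ksz, j:j + ksz] * kernel)
    return sigmoid(out)

def cbam(fe, w0, w1, kernel):
    """Channel attention followed by spatial attention (Eq. (19))."""
    fe1 = fe * channel_attention(fe, w0, w1)[:, None, None]
    return fe1 * spatial_attention(fe1, kernel)[None, :, :]
```

Because both attention maps pass through a sigmoid, every scaling factor lies strictly between 0 and 1, so the module re-weights rather than replaces the input features.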

The Adam optimizer is central to training the Modified ShuffleNet architecture, which combines shape-based and gradient-pattern features for lung cancer classification. Combining the strengths of RMSProp and AdaGrad, Adam provides an adaptive learning rate for every parameter and uses momentum to accelerate convergence. By managing the intricate feature space produced by the CMN and BN layers, this optimization technique ensures stable and efficient learning across the network. Faster convergence and improved generalization are crucial when training deep neural networks on medical imaging data, which frequently contain sparse and heterogeneous samples, and Adam helps achieve both. Its integration, together with the segmentation capabilities of the improved M-SegNet, allows the end-to-end system to achieve high classification accuracy and robustness in lung cancer detection.
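A minimal sketch of the standard Adam update rule described above (generic Adam, not code from the proposed model):

```python
import numpy as np

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moments.
    state holds t (step count), m (momentum term) and v (squared-gradient
    running average); works element-wise for arrays or scalars."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```

With default settings the per-parameter step is bounded by roughly `lr`, which is what stabilizes training over the heterogeneous feature space produced by the CMN and BN layers.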

Advantages of the modified ShuffleNet

  • Combining the Modified ShuffleNet with the Convolutional Block Attention Module (CBAM) and Custom Mean Normalization (CMN) provides a highly effective and precise approach to lung cancer classification.

  • By normalizing features in a way that improves convergence and reduces training instability, CMN guarantees a consistent input distribution, which is crucial when working with complex medical data.

  • By applying channel and spatial attention, CBAM further enhances the model, allowing it to concentrate on the most relevant areas and characteristics in lung images, which is essential for spotting small cancers.

  • Meanwhile, Modified ShuffleNet’s lightweight design preserves strong computational efficiency, qualifying the model for deployment in low-resource settings and real-time applications.

  • In comparison to conventional learning models, this combination achieves improved classification performance by striking a potent balance between speed, precision, and discriminative capability.

Global Pooling: It is applied after all stages, which reduces the size of each feature map to a single value.

FC Layer: Pooled features are passed to the FC layer, where the model learns to map the extracted features to the final class labels.

Output Layer: The output layer applies an activation function, which gives the final prediction.

The output from this CMN-ShuffleNet model classifies the output as ‘0’ (non-cancer) or ‘1’ (cancer). Hence, the CMN-ShuffleNet model output is denoted as \(CMN - ShuffleNet_{i}\). Table 2 shows the hyperparameters of the classifier.
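The global pooling, FC and output steps above can be sketched as follows; the weight matrix `w_fc` and bias `b_fc` are hypothetical placeholders, not parameters from the trained model:

```python
import numpy as np

def classifier_head(feature_maps, w_fc, b_fc):
    """Global average pooling + FC layer + sigmoid output, returning the
    binary label: '1' (cancer) or '0' (non-cancer).
    feature_maps: (C, H, W); w_fc: (1, C) hypothetical FC weights."""
    pooled = feature_maps.mean(axis=(1, 2))   # global pooling -> (C,)
    logit = w_fc @ pooled + b_fc              # FC layer -> (1,)
    prob = 1.0 / (1.0 + np.exp(-logit))       # output activation
    return int(prob[0] >= 0.5)                # threshold at 0.5
```

The same head applies regardless of the backbone, since global pooling collapses each feature map to a single value before classification.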

Table 2 Hyperparameters of classifier.

Table 3 shows the architectural difference of the existing and the proposed model.

Table 3 Architectural differences of existing and proposed model.

Advantages of the proposed model

  • For the classification of lung cancer, the Modified ShuffleNet trained on Gradient Pattern and Shape-based Features with Improved M-SegNet Segmentation provides a number of significant benefits.

  • First, by using recurrent residual blocks and sophisticated activation functions to capture both local and long-range dependencies, Improved M-SegNet (mRRB-SegNet) improves segmentation accuracy and isolates lung areas more precisely.

  • The model can extract highly discriminative information from segmented images by integrating Improved Local Gradient Pattern (ILGP) with shape and statistical data. This enhances the algorithm’s capacity to identify small alterations associated with malignancy.

  • The lightweight and effective ShuffleNet design guarantees quick inference without sacrificing accuracy, while the Custom Mean Normalization (CMN) enhances input consistency.

  • The model’s focus on important features is further enhanced by adding attention mechanisms like CBAM.

Results and discussion

Simulation procedure

The proposed lung cancer classification was implemented using Python, version 3.7. The processor utilized was an AMD Ryzen 5 3450U with Radeon Vega Mobile Gfx at 2.10 GHz, with 16.0 GB of installed RAM.

Dataset description

The investigation and validation in this study are supported by two datasets. Dataset 1, the LUNA16 dataset33, serves as the main source for the whole analytical process, including model construction and performance evaluation. Dataset 2, the LIDC-IDRI34 database, is used exclusively for cross-validation to guarantee the robustness and generalizability of the results. By utilizing LUNA16 for thorough analysis and LIDC-IDRI for validation, the study preserves uniformity in data characteristics while boosting the dependability of its conclusions.

Dataset 1 description

The lung cancer classification was analyzed using LUNA1633. This dataset includes a total of 888 CT scans. In this research, data were collected from 445 patients, comprising 306 cancer patients and 139 non-cancer patients. Two classes are used: Non-Cancer (label 0) and Cancer (label 1). A total of 2,336 image samples were produced during experimentation, with 1,112 assigned to label 0 (non-cancer) and 1,224 to label 1 (cancer). This total of 2,336 samples (1,112 + 1,224) is distinct from the initial count of 888 scans.

Given that the dataset is reasonably balanced, with class 0 comprising 1,112 samples and class 1 comprising 1,224 samples, class imbalance is negligible. Because of this small discrepancy, no class balancing strategies such as undersampling, oversampling, or synthetic data generation (e.g., SMOTE) were needed. The model learned features from both classes without appreciable bias, contributing to dependable classification performance. Table 4 shows the training and testing splits. For 60% training, 1,401 training images and 935 testing images are used; for 70% training, 1,635 and 701; for 80% training, 1,868 and 468; and for 90% training, 2,102 and 234, respectively.
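The counts in Table 4 are consistent with a simple floor-based split of the 2,336 samples; the helper below is a sketch of one plausible splitting rule, not necessarily the authors' exact procedure:

```python
def split_counts(total, train_frac):
    """Train/test counts for a given training fraction, taking the floor
    of total * train_frac for the training partition."""
    train = int(total * train_frac)
    return train, total - train
```

For example, `split_counts(2336, 0.6)` yields the 1,401/935 partition reported for 60% training.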

Table 4 Training and testing image.

Dataset 2 description (Dataset used for cross validation)

The LIDC-IDRI dataset was obtained from34. The Lung Image Database Consortium image collection (LIDC-IDRI) includes thoracic computed tomography (CT) scans with marked-up annotated lesions for lung cancer screening and diagnosis. This global resource is available online for the development, instruction, and assessment of computer-assisted diagnostic (CAD) techniques for the identification and diagnosis of lung cancer. The success of a consortium built on a consensus-based process is demonstrated by this public–private partnership, which was started by the National Cancer Institute (NCI), advanced by the Foundation for the National Institutes of Health (FNIH), and actively participated in by the Food and Drug Administration (FDA).

Performance analysis

A comprehensive evaluation was conducted to compare the proposed lung cancer classification method against established methodologies. The evaluation compared the CMN-ShuffleNet approach with state-of-the-art techniques such as modified VGG1635, DNN14, AlexNet-SVM18 and ATT-DenseNet16, as well as conventional classifiers such as Gated Recurrent Unit (GRU), LeNet, Bidirectional Long Short-Term Memory (Bi-LSTM), LinkNet and ShuffleNet. The state-of-the-art models and the traditional methods were implemented and compared on the LUNA16 dataset; the results were produced and analyzed using the available source code, and LIDC-IDRI was used for cross-validation. This examination employed many performance measures, including “Sensitivity, Negative Predictive Value (NPV), Specificity, F-measure, False Negative Rate (FNR), Precision, False Positive Rate (FPR), Matthews Correlation Coefficient (MCC), and Accuracy”, to exhaustively inspect the performance of the CMN-ShuffleNet method. Further, the original images and the HE pre-processed images are displayed in Fig. 7; this procedure is employed to enhance the quality of lung images.
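As a generic reference (not the authors' evaluation code), all of the listed measures can be computed from the four confusion-matrix counts:

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    sens = tp / (tp + fn)                   # Sensitivity (recall)
    spec = tn / (tn + fp)                   # Specificity
    prec = tp / (tp + fp)                   # Precision
    npv = tn / (tn + fn)                    # Negative predictive value
    acc = (tp + tn) / (tp + tn + fp + fn)   # Accuracy
    f1 = 2 * prec * sens / (prec + sens)    # F-measure
    mcc = (tp * tn - fp * fn) / math.sqrt(  # Matthews correlation coefficient
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": acc, "sensitivity": sens, "specificity": spec,
            "precision": prec, "f_measure": f1, "mcc": mcc, "npv": npv,
            "fpr": 1 - spec, "fnr": 1 - sens}
```

Note that FPR and FNR are the complements of specificity and sensitivity, respectively, which is why a model with high positive measures automatically attains low negative measures.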

Fig. 7

Pre-processed outcomes (a) Sample Images and (b) HE utilizing pre-processed images.

Segmentation analysis

Segmentation Accuracy: A metric that determines how closely a measurement’s output matches the expected value or standard. It measures the accuracy of a classification scheme and is expressed in Eq. (24).

$${\text{Segmentation}}\,{\text{Accuracy}} = \frac{{T_{P} + T_{N} }}{{T_{P} + T_{N} + F_{P} + F_{N} }}$$
(24)

Figure 8 presents sample images alongside their respective segmented results for BIRCH, conventional SegNet, U-Net and mRRB-SegNet for lobe segmentation. In this context, mRRB-SegNet produced exceptional segmentation outcomes compared to the conventional approaches. The mRRB layer and the mELS-PReLU activation function in mRRB-SegNet enable it to capture long-range dependencies and improve segmentation accuracy.

Fig. 8

Segmented Results (a) Sample Images (b) BIRCH (c) Conventional SegNet (d) U-Net and (e) mRRB-SegNet.

Analysis of gradient pattern

Figure 9 shows the input image and the corresponding ILGP images, illustrating how the gradient pattern can improve feature extraction for lung cancer classification. Gradient patterns are essential for detecting minute texture alterations in medical images because they capture local intensity variations and directional changes in pixel values. The Improved Local Gradient Pattern (ILGP) approach refines this procedure by focusing on stable directional gradients and reducing noise, producing features that are more resilient and discriminative. Since textural anomalies are frequently subtle but diagnostically significant, this is especially helpful in differentiating between malignant and benign tissues.
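The conventional local gradient pattern idea can be sketched as follows; this is a basic LGP variant for illustration only, since the exact thresholding of the improved ILGP is not reproduced here:

```python
import numpy as np

def local_gradient_pattern(img):
    """Basic local gradient pattern code for a grayscale image.
    For each interior pixel, the absolute gradients |neighbor - center| of
    its 8 neighbors are thresholded by their own mean to form an 8-bit code;
    border pixels are left as 0."""
    H, W = img.shape
    out = np.zeros((H, W), dtype=np.uint8)
    # 8-neighborhood offsets, clockwise from the top-left neighbor
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            grads = np.array([abs(float(img[i + di, j + dj]) - float(img[i, j]))
                              for di, dj in offs])
            thr = grads.mean()                    # adaptive local threshold
            bits = (grads >= thr).astype(np.uint8)
            out[i, j] = np.packbits(bits)[0]      # 8 bits -> one code value
    return out
```

Because the threshold is derived from the local gradients themselves, the code is invariant to uniform brightness shifts, which is why gradient patterns capture texture rather than absolute intensity.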

Fig. 9

Gradient Pattern Analysis for (a) Input image (b) ILGP.

Analysis on dice, jaccard and segmentation accuracy

Jaccard Index

Calculates the overlap between the ground truth and the predicted segmentation, as expressed in Eq. (25).

$${\text{Jaccard}}\,{\text{Index}} = \frac{{\left| {A \cap B} \right|}}{{\left| {A \cup B} \right|}}$$
(25)

\(A =\) Set of pixels in the ground truth.

\(B =\) Set of pixels in the predicted segmentation.

Dice coefficient

It is equivalent to the harmonic mean of recall and precision, as expressed in Eq. (26).

$${\text{Dice}}\,{\text{ coefficient}} = 2\frac{{\left| {A \cap B} \right|}}{\left| A \right| + \left| B \right|}$$
(26)
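As a concrete reference, these overlap measures (together with the segmentation accuracy of Eq. (24)) can be computed directly on binary masks; this is a generic sketch, not the authors' evaluation code:

```python
import numpy as np

def seg_accuracy(pred, gt):
    """(TP+TN)/(TP+TN+FP+FN) per Eq. (24); pred and gt are boolean masks."""
    return np.mean(pred == gt)

def jaccard(pred, gt):
    """|A ∩ B| / |A ∪ B| per Eq. (25)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

def dice(pred, gt):
    """2|A ∩ B| / (|A| + |B|) per Eq. (26)."""
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum())
```

Dice weights the intersection twice, so for the same masks it is always at least as large as the Jaccard index, which matches the reported ordering of the two scores.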

Table 5 presents the comparative segmentation assessment of mRRB-SegNet against existing segmentation methods, including BIRCH, U-Net and conventional SegNet, for lobe segmentation. In segmentation accuracy, mRRB-SegNet attained the highest score of 0.926, while the traditional approaches recorded lower values ranging from 0.754 to 0.799. Additionally, mRRB-SegNet demonstrated a higher Dice score of 0.911 and Jaccard score of 0.903, whereas BIRCH, U-Net and conventional SegNet acquired lower values. The mRRB layer in the mRRB-SegNet approach allows for deeper network architectures and enables the network to learn more complex features.

Table 5 Dice, Jaccard and Segmentation Accuracy analysis on mRRB-SegNet and Existing Segmentation Methods.

Analysis of feature comparison

A thorough feature comparison analysis of the various approaches used in the lung cancer classification model is shown in Table 6. These approaches include using only the improved Local Gradient Pattern (LGP), using only shape-based features, extracting no features, and the proposed method that combines both feature types. In every assessed parameter, the proposed approach performs noticeably better than the others. It maintains the lowest false positive rate (FPR: 4.50%) and false negative rate (FNR: 6.10%) while achieving the maximum accuracy (94.66%), sensitivity (93.90%), specificity (95.50%), precision (95.85%), F-measure (94.87%), MCC (0.8932), and NPV (93.39%). In contrast, performance decreases when LGP or shape-based features are used alone, especially in sensitivity and MCC, indicating that either feature type alone is insufficient for optimal discrimination.

Table 6 Feature comparison analysis.

Comparative analysis

The comparative assessment of the CMN-ShuffleNet approach for lung cancer classification is systematically performed against existing strategies, such as GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18 and ATT-DenseNet16. The assessment covers a wide-ranging set of performance measures, incorporating positive, negative and neutral measures, which are shown in Figs. 10, 11 and 12. For an effective lung cancer classification system, the model should attain greater values in positive and neutral measures while sustaining lower values in negative measures. At 60% training data, the CMN-ShuffleNet achieved an accuracy of 0.905, which is notably higher than existing approaches. As the training data increased to 70%, 80% and 90%, the CMN-ShuffleNet continued its lead with accuracies of 0.912, 0.947 and 0.958, respectively. In comparison, conventional methods like GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18, and ATT-DenseNet16 showed increasing accuracy but did not surpass the CMN-ShuffleNet method’s performance. At 90% training data, the CMN-ShuffleNet approach reached a sensitivity of 0.957, signifying that it is effective in categorizing lung cancer. In contrast, the conventional methods recorded lower sensitivity values, with GRU at 0.876, LeNet at 0.884, Bi-LSTM at 0.843, LinkNet at 0.901, ShuffleNet at 0.893, modified AlexNet-SVM18 at 0.893 and ATT-DenseNet16 at 0.917, respectively. The ILGP can accurately identify small changes in the intensities of particular pixels and also increases the discriminative power.

Fig. 10

Positive metric assessment on CMN-ShuffleNet versus existing methods (i) Accuracy (ii) Precision (iii) Sensitivity and (iv) Specificity.

Fig. 11

Negative metric assessment on CMN-ShuffleNet versus existing methods (i) FNR and (ii) FPR.

Fig. 12

Neutral metric assessment on CMN-ShuffleNet versus existing methods (i) F-measure (ii) MCC and (iii) NPV.

The CMN-ShuffleNet achieves the highest NPV of 0.955 at 90% training data, exhibiting its excellent classification performance. In comparison, Bi-LSTM, GRU, LeNet and modified AlexNet-SVM18 acquired NPVs of 0.836, 0.867, 0.874 and 0.945. While these models perform well, they do not match the CMN-ShuffleNet model’s capability to precisely classify lung cancer. ShuffleNet, LinkNet and ATT-DenseNet16 have NPVs of 0.891, 0.896 and 0.940, which also rank below the CMN-ShuffleNet approach. More particularly, the CMN-ShuffleNet model attained an FNR of 0.061 at 80% training data, significantly lower than the other methods, for which GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18, and ATT-DenseNet16 obtained higher FNR values of 0.187, 0.142, 0.167, 0.122, 0.134, 0.125 and 0.104, respectively. Initially, an enhanced M-SegNet (Modified SegNet) model is employed to reduce false positives and improve border detection for accurate lung area segmentation. The computationally efficient Modified ShuffleNet is then trained on raw image data together with gradient-pattern and shape-based features to capture more discriminative information. This hybrid feature integration enhances classification performance while keeping the model lightweight and adaptable for clinical settings with limited resources. The mean normalization layer and the convolutional block attention module introduce channel and spatial attention mechanisms, which significantly advance the classification performance.

Statistical assessment in terms of accuracy

To assess the efficacy of the CMN-ShuffleNet strategy, a statistical analysis is conducted comparing it with existing approaches, such as GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18, and ATT-DenseNet16 for lung cancer classification, as illustrated in Fig. 13. The CMN-ShuffleNet approach attains the greatest accuracy of 0.958 for the Maximum statistical measure, suggesting its ability to reach the highest level of performance. This score surpasses those of GRU (0.872), Bi-LSTM (0.850), LinkNet (0.906), ShuffleNet (0.915), modified AlexNet-SVM18 (0.950) and ATT-DenseNet16 (0.945). ILGP produces a smaller threshold value, which helps capture changes in discriminative information. For the Mean statistical metric, the CMN-ShuffleNet approach attained the highest accuracy of 0.930, whereas GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18, and ATT-DenseNet16 accomplished lower values. The CBAM module dynamically adjusts its attention based on the features, allowing the network to prioritize regions specific to lung cancer patterns. This leads the CMN-ShuffleNet model to better categorization of cancerous regions.

Fig. 13

Statistical evaluation in terms of accuracy.

Comparison with the state-of-the-art models

Figure 14 shows that the suggested CMN-ShuffleNet model outperforms the baseline models such as VGG16, DNN, AlexNet-SVM, and ATT-DenseNet in the categorization of lung cancer. With the best overall classification efficacy, CMN-ShuffleNet attains the maximum accuracy (94.66%), sensitivity (93.90%), specificity (95.50%), precision (95.85%), and F-measure (94.87%). It also shows the strongest correlation between the actual and predicted classes, with the greatest Matthews Correlation Coefficient (MCC) of 0.8932. To further bolster its dependability, CMN-ShuffleNet also maintains the lowest false positive rate (FPR = 0.045) and a competitively low false negative rate (FNR = 0.061). CMN-ShuffleNet effectively balances specificity and sensitivity, avoiding bias toward either class, in contrast to VGG16 and AlexNet-SVM, which exhibit somewhat lower specificity and sensitivity. With these enhancements, CMN-ShuffleNet is a strong option for precise and effective lung cancer diagnosis, demonstrating the value of combining gradient and shape characteristics with a lightweight yet expressive network structure.

Fig. 14

Analysis of State-of-the-art Comparison Models.

Analysis on individual features

Figure 15 compares the performance of several feature extraction techniques (statistical features, shape-based features, and the conventional local gradient pattern (LGP)) with the proposed combined approach within the Modified ShuffleNet framework, trained on gradient pattern and shape-based features with enhanced M-SegNet segmentation for lung cancer classification. With the best accuracy (0.9466), sensitivity (0.9390), specificity (0.9550), precision (0.9585), F-measure (0.9487), and the strongest Matthews Correlation Coefficient (MCC: 0.8932), the proposed approach performs noticeably better than any of the individual feature sets. In addition, it produces the lowest false positive rate (FPR: 0.0450) and false negative rate (FNR: 0.0610), as well as the best negative predictive value (NPV: 0.9339), indicating a more accurate and balanced categorization. Although statistical features outperform shape-based and LGP features individually, none of the single-feature methods match the resilience and discriminative capability of the combined feature strategy. This highlights how combining complementary features can improve classification accuracy by capturing both structural and textural subtleties in lung cancer imaging.

Fig. 15

Individual feature analysis.

Ablation study

To evaluate the impact of various components on the performance of the CMN-ShuffleNet approach, an ablation study was conducted. This assessment examined the efficiency of distinct configurations, including CMN-ShuffleNet without preprocessing, CMN-ShuffleNet with existing LGP, CMN-ShuffleNet with existing SegNet, CMN-ShuffleNet with shape-based features only and CMN-ShuffleNet with statistical features only, as shown in Table 7. The CMN-ShuffleNet with existing LGP and with existing SegNet achieved F-measures of 0.901 and 0.889, while the CMN-ShuffleNet with shape-based features only and without pre-processing recorded F-measure values of 0.889 and 0.890. The CMN-ShuffleNet with statistical features only recorded an F-measure of 0.915, showcasing improved performance in lung cancer classification. However, the full CMN-ShuffleNet approach demonstrates a higher F-measure of 0.949, revealing its superior performance in accurately categorizing lung cancer. The minimum FPR achieved by the CMN-ShuffleNet is 0.045, exhibiting reduced error values, whereas all of the ablated configurations recorded greater FPR values.

Table 7 Ablation Evaluation on CMN-ShuffleNet Strategy, CMN-ShuffleNet without preprocessing, CMN-ShuffleNet with existing LGP, CMN-ShuffleNet with existing SegNet, CMN-ShuffleNet with shape-based feature only and CMN-ShuffleNet with statistical feature only.

Analysis of three-stage feature dimension

The three-stage feature dimension analysis is shown in Table 8. The input layer processes 224 × 224 × 3 images, typical of medical imaging standards. The first convolutional layer increases the channel depth to 24 and halves the spatial dimensions to 112 × 112, marking early feature extraction. Batch normalization and ReLU activation follow, standardizing activations and introducing non-linearity without changing dimensions. A max pooling layer then reduces spatial dimensions to 56 × 56 while maintaining channel depth, efficiently summarizing the feature maps. This compact feature-dimension progression reflects the lightweight yet discriminative design of the modified ShuffleNet, tuned for extracting crucial texture and shape signals related to lung cancer detection from the regions segmented by the improved M-SegNet.

Table 8 Three-stage feature dimension.

Ablation study based on the improved segmentation

Using enhanced M-SegNet segmentation, Table 9 presents an ablation study assessing several activation functions and the effect of eliminating the Residual Refinement Block (RRB) in the proposed Modified ShuffleNet trained on gradient pattern and shape-based features for lung cancer classification. Among the tested activation functions, softplus performs the best overall, obtaining the highest accuracy (0.935), sensitivity (0.934), specificity (0.936), precision (0.940), and F-measure (0.937), while also maintaining lower false positive (FPR: 0.064) and false negative (FNR: 0.066) rates. On the other hand, traditional ReLU exhibits the lowest accuracy (0.917) and MCC (0.859), among other metrics. The parametric ReLU (PReLU) outperforms ReLU with modest gains in all metrics, although it still falls short of softplus. Softplus significantly improves model performance in this lung cancer classification framework, highlighting the importance of activation function selection.

Table 9 Analysis of ablation study based on the activation functions.

Analysis of cross-validation

Cross-validation analysis provides a strong framework for evaluating the model’s generalization ability across various data subsets, as shown in Table 10. Dataset 1 is used for training, and Dataset 2 (the LIDC-IDRI dataset) is used for cross-validation. Cross-validation ensures that the reported performance measures, such as an accuracy of 94.79%, sensitivity of 94.68%, and specificity of 94.94%, are not the consequence of overfitting to a specific subset, by repeatedly training and testing the model on different dataset partitions. Strong predictive ability and a balance between false positives and false negatives are indicated by the high precision (95.07%) and F-measure (94.87%) values. The model’s dependable performance in binary classification, which takes both true and erroneous predictions into account, is further demonstrated by its Matthews Correlation Coefficient (MCC) of 0.9066. The low false positive rate (FPR = 5.06%) and false negative rate (FNR = 5.32%) further demonstrate the model’s resilience in differentiating between malignant and non-malignant cases, a crucial diagnostic function in medicine. This investigation confirms the efficacy of the proposed deep learning pipeline, which combines sophisticated feature extraction and segmentation techniques for precise lung cancer identification.

Table 10 Cross-validation analysis.

Analysis of K-fold validation

A reliable cross-validation method for evaluating the performance and generalizability of machine learning models is K-fold validation, which is especially useful in medical imaging where data variability is significant. In this study, K-fold validation was used to assess the consistency and dependability of several deep learning models, including the proposed Modified ShuffleNet trained on gradient pattern and shape-based features taken from lung CT images, as shown in Table 11. Each model was trained and validated k times using the dataset’s k subsets (folds), with each iteration using a new fold for testing and the remaining folds for training. This method reduces volatility and bias in performance estimation. When combined with improved M-SegNet segmentation and carefully designed input features, the Modified ShuffleNet consistently outperformed conventional architectures like GRU, LeNet, and Bi-LSTM across all five folds. Using the enhanced M-SegNet segmentation and gradient pattern and shape-based feature training, the proposed CMN-ShuffleNet attained the greatest accuracy in each fold (ranging from 0.9379 to 0.9503). This improved performance demonstrates the value of combining feature-rich representations with a network architecture that is both lightweight and strong, and the superior robustness and precision of CMN-ShuffleNet validate its applicability for accurate lung cancer diagnosis.
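The fold-generation step described above can be sketched as follows; this is a generic K-fold index generator, not the study's exact partitioning:

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold validation.
    Each of the k folds serves as the test set exactly once, with the
    remaining folds concatenated for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)        # shuffle once up front
    folds = np.array_split(idx, k)          # k (near-)equal partitions
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

Every sample appears in exactly one test fold, so the k per-fold scores together cover the full dataset, which is what makes the averaged estimate low-variance.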

Table 11 K-fold validation analysis.

Analysis of the Friedman test

The Friedman Test Analysis shown in Table 12 assesses the statistical significance of performance differences among different deep learning architectures. The Friedman test’s p-values show whether observed performance differences are statistically significant or the result of chance. In contrast to CMN-ShuffleNet, models such as ShuffleNet (p = 0.014) and ATT-DenseNet (p = 0.043) exhibit notable differences, indicating that CMN-ShuffleNet offers better or noticeably different performance. However, models like LeNet (p = 0.089) and BiLSTM (p = 0.176) exhibit less statistical differentiation, suggesting similar performance. Overall, the Friedman test demonstrates that CMN-ShuffleNet’s combination of gradient and shape data with M-SegNet’s enhanced segmentation results in statistically significant improvements over a number of conventional models for classifying lung cancer.

Table 12 Friedman test analysis.

Analysis of Wilcoxon test

Table 13 presents the Wilcoxon test analysis, which highlights the statistical significance of performance differences between CMN-ShuffleNet and the other models. The Wilcoxon test is a non-parametric statistical test used to compare matched samples and determine whether their population mean ranks differ. Here, it was utilized to assess the significance of the performance differences between the CMN-ShuffleNet, trained on gradient pattern and shape-based features, and the other lung cancer classification models. Significantly, CMN-ShuffleNet outperforms GRU (p = 0.004), AlexNet-SVM (p = 0.004), LeNet (p = 0.047), and ATT-DenseNet (p = 0.035), as seen by their p-values falling below the conventional 0.05 cutoff. This implies that the proposed model performs noticeably better than these architectures in lung cancer classification tests. The comparisons with BiLSTM (p = 0.160), LinkNet (p = 0.190), and ShuffleNet (p = 0.138) did not show statistically significant differences, indicating comparable performance.

Table 13 Analysis of the Wilcoxon test.

Analysis of T-test value

Table 14 presents a T-test analysis comparing the proposed CMN-ShuffleNet model’s performance with that of the comparison models. The T-test assesses whether the variations in classification accuracy (or other performance measures) between each baseline model and CMN-ShuffleNet are statistically significant or merely the result of chance. By computing the T-test value, the study evaluates whether the enhancements made by the modified ShuffleNet are consistently superior to those of rival models, confirming the efficacy of combining gradient pattern and shape-based features for lung cancer classification. Models that displayed a moderate statistical difference, suggesting some degree of performance variation, included LeNet (0.108446), LinkNet (0.194792), ShuffleNet (0.164762), and ATT-DenseNet (0.160142). Conversely, AlexNet-SVM (0.264762) and GRU (0.601421) exhibited comparatively higher T-test values, indicating less significant differences. Overall, these findings demonstrate CMN-ShuffleNet’s efficacy compared with the other existing models.

Table 14 T-test analysis.

Analysis of P-test value

The P-test analysis in Table 15 contrasts different deep learning models with the CMN-ShuffleNet. A P-test, or probability test, is a statistical technique used to assess whether performance differences between two models are statistically significant or more likely the result of chance. A lower P-value (usually less than 0.05) denotes a substantial difference in classification performance, while a higher P-value (close to 1) shows the models perform equivalently. High P-values thus indicate statistically equivalent performance to CMN-ShuffleNet and suggest that models like BiLSTM (0.972), LeNet (0.891), and LinkNet (0.805) may capture similar feature representations or have comparable classification efficacy. Other models, such as AlexNet-SVM (0.635), ShuffleNet (0.735), and ATT-DenseNet (0.739), also exhibit moderate resemblance, although with greater variability. These findings demonstrate the competitive standing of CMN-ShuffleNet among existing models.

Table 15 P-test analysis.

Validation of pre-processed image using histogram equalization

The evaluation of pre-processed lung CT images following histogram equalization, measured with BRISQUE, NIQE, and PIQE both with and without pre-processing, is shown in Fig. 16. Higher quality scores (1.5 to 1.9) in comparison to unprocessed data (scores ranging from 0.5 to 1.0) show that pre-processing generally improves image quality and, in turn, classification accuracy across the board. Particularly in the context of medical imaging for lung cancer detection, these results emphasize the significance of pre-processing in improving the visual quality of input images, which enables better feature extraction and more precise segmentation and classification. BRISQUE scores between 1.50 and 1.75 indicate generally excellent perceptual quality after enhancement, while NIQE values between 1.60 and 1.70 indicate a modest, model-agnostic decline in naturalness that still falls within acceptable quality ranges. Notably, localized regions exhibit minimal perceptual distortion, as seen in the lowest PIQE score of roughly 0.60. This pre-processing step enables improved segmentation and diagnostic accuracy and likely helps the Gradient Pattern and Shape-based Modified ShuffleNet pipeline train more robustly.
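Histogram equalization, the pre-processing step being validated here, can be sketched for an 8-bit grayscale image as follows (a generic sketch, not tied to the paper's exact pipeline):

```python
import numpy as np

def histogram_equalization(img):
    """Histogram equalization for an 8-bit grayscale image.
    Remaps intensities so the cumulative distribution becomes
    approximately linear, stretching contrast across the full range."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]              # first non-zero CDF value
    denom = max(img.size - cdf_min, 1)     # guard against constant images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0),
                  0, 255).astype(np.uint8)
    return lut[img]                        # apply lookup table per pixel
```

After equalization the darkest occurring gray level maps to 0 and the brightest to 255, which is the contrast stretch that makes faint nodule boundaries easier to segment.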

Fig. 16

Validation of pre-processed image.
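The histogram equalization step itself is standard: the cumulative intensity distribution of each image is stretched over the full dynamic range. A minimal NumPy sketch (assuming an 8-bit grayscale slice; the paper's actual pipeline may use a library routine such as OpenCV's `equalizeHist`):

```python
import numpy as np

def equalize_hist(img: np.ndarray) -> np.ndarray:
    """Histogram equalization for an 8-bit grayscale image (e.g. a CT slice
    rescaled to [0, 255]): maps intensities through the normalized CDF so the
    output uses the full dynamic range."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                     # first nonzero CDF value
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    return lut.astype(np.uint8)[img]

# Low-contrast synthetic "slice": intensities confined to [100, 150]
low_contrast = np.random.default_rng(1).integers(100, 151, (64, 64)).astype(np.uint8)
enhanced = equalize_hist(low_contrast)
print(int(low_contrast.max()) - int(low_contrast.min()),   # narrow input range
      int(enhanced.max()) - int(enhanced.min()))           # full 0-255 range after
```

After equalization the occupied intensity range spans the full 0–255 interval, which is the contrast gain the BRISQUE/NIQE/PIQE comparison above is measuring.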

Parametric analysis based on varying learning rate

In the modified ShuffleNet, parametric analysis of the learning rate refers to systematically adjusting the learning rate to optimize training when the network learns from gradient pattern and shape-based features extracted from lung CT scans. Paired with the improved M-SegNet segmentation, this approach enhances feature representation, stabilizes convergence, and improves the accuracy of lung cancer classification. Table 16 shows the performance of the proposed model as a function of learning rate. A learning rate of 0.001 produces the best results, with the highest accuracy (0.946), sensitivity (0.939), specificity (0.954), precision (0.958), F-measure (0.948), and MCC (0.893), as well as the lowest false positive rate (0.045) and false negative rate (0.060). The steady improvements observed across these assessment criteria demonstrate the method's resilience in handling intricate medical imaging tasks, making it well suited to reliable and accurate lung cancer diagnosis.

Table 16 Parametric analysis based on varying learning rate.
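The sweep behind Table 16 amounts to training the same model once per candidate learning rate and keeping the rate with the best validation accuracy. A minimal sketch (the accuracies for rates other than 0.001 are illustrative placeholders, not values from the paper):

```python
def select_learning_rate(results: dict) -> tuple:
    """Return (learning_rate, accuracy) for the highest validation accuracy."""
    best_lr = max(results, key=results.get)
    return best_lr, results[best_lr]

# Validation accuracy per candidate learning rate; 0.946 at lr=0.001 matches
# Table 16, the other entries are hypothetical examples of a typical sweep.
sweep = {0.1: 0.901, 0.01: 0.922, 0.001: 0.946, 0.0001: 0.931}
print(select_learning_rate(sweep))  # → (0.001, 0.946)
```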

Analysis of training and testing accuracy

Table 17 shows the training and testing results, with only slight performance reductions from training to testing, indicating strong generalization. The model attains high accuracy (0.9908 train, 0.9584 test) and F1-score (0.9917 train, 0.9593 test), suggesting steady and balanced performance across classes. A Matthews Correlation Coefficient (MCC) of 0.9476 (train) and 0.9167 (test) indicates strong agreement between predicted and actual classifications despite minor variance. Low false positive rates (FPR) and false negative rates (FNR) in both training (0.0077, 0.0103) and testing (0.0401, 0.0426) further support the robustness of the classifier. Collectively, these findings imply that the proposed method successfully balances sensitivity and specificity, providing a dependable and precise option for automated lung cancer diagnosis.

Table 17 Analysis of training and testing accuracy.
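All of the metrics in Table 17 derive from the four confusion-matrix counts. A self-contained sketch of the standard definitions (the counts fed in below are those reported in Fig. 19, used here only to demonstrate the formulas):

```python
import math

def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard metrics computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)                         # false positive rate
    fnr = fn / (fn + tp)                         # false negative rate
    mcc = (tp * tn - fp * fn) / math.sqrt(       # Matthews correlation coefficient
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "mcc": mcc, "fpr": fpr, "fnr": fnr}

# Counts from the confusion matrix in Fig. 19: 115 TP, 108 TN, 6 FP, 5 FN
print(classification_metrics(tp=115, tn=108, fp=6, fn=5))
```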

Analysis of training and validation loss

Figure 17 presents the training and validation loss of the modified ShuffleNet model trained on gradient pattern and shape-based features for lung cancer classification, demonstrating effective learning and generalization. Both loss curves show a consistent decline, signifying successful convergence. Integrating the upgraded M-SegNet improves segmentation quality, which also helps reduce overfitting and validation loss. Both losses begin above 0.12 and decrease gradually, indicating that the model is learning and converging. The training loss approaches 0.09 by epoch 50 and the validation loss stabilizes slightly above it, indicating strong generalization and little overfitting. The close alignment of the two curves throughout training suggests that the model is well supported by the hybrid feature set and the precise segmentation provided by M-SegNet, which improves its capacity to identify lung cancer patterns.

Fig. 17

Training and validation loss analysis.

Analysis of ROC

The Receiver Operating Characteristic (ROC) curve plots the true positive rate against the false positive rate at different thresholds to assess the model’s capacity to differentiate between malignant and non-cancerous instances. It offers a thorough picture of classification performance, with the area under the ROC curve (AUC) indicating the system’s overall diagnostic accuracy. Figure 18 presents the ROC curve analysis comparing several deep learning models for lung cancer classification and highlights the efficiency of the Modified ShuffleNet architecture trained on gradient pattern and shape-based features in conjunction with enhanced M-SegNet segmentation. CMN-ShuffleNet demonstrated the most robust classification performance, achieving the highest AUC of 0.95 among the models studied. This strong AUC underscores the value of combining context-aware modules with improved feature extraction approaches, especially in challenging medical imaging applications such as lung cancer detection.

Fig. 18

ROC curve analysis.
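The AUC has an equivalent rank-based formulation (the Mann-Whitney U statistic): it is the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. A compact sketch with toy scores (ties are ignored for brevity; library routines such as scikit-learn's `roc_auc_score` handle them):

```python
import numpy as np

def roc_auc(labels: np.ndarray, scores: np.ndarray) -> float:
    """AUC via the Mann-Whitney U statistic (no tie handling, for brevity)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # rank 1 = lowest score
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: perfectly separated scores give an AUC of 1.0
y = np.array([0, 0, 0, 1, 1, 1])
s = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.9])
print(roc_auc(y, s))  # → 1.0
```

An AUC of 0.95, as reported for CMN-ShuffleNet, means a randomly chosen cancerous slice outranks a randomly chosen non-cancerous one 95% of the time.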

Analysis of confusion matrix

A confusion matrix summarizes prediction results in classification tasks by comparing actual labels with predicted labels. Figure 19 shows the confusion matrix for the Modified ShuffleNet trained on gradient pattern and shape-based features with enhanced M-SegNet segmentation, and it aids in evaluating the model’s ability to distinguish between malignant and non-cancerous classes. The matrix reveals strong classification performance: the model correctly recognizes 108 non-cancerous instances (46.15%) and 115 malignant cases (49.15%), demonstrating high accuracy across both groups. Only a small number of misclassifications occurred, with 6 non-cancerous samples (2.56%) mistakenly predicted as malignant and 5 cancerous samples (2.14%) misclassified as non-cancerous. This balanced distribution of correct predictions suggests that the model is not biased toward either class and can recognize subtle differences between cancerous and non-cancerous CT slices. The low rates of false positives and false negatives further underline the model’s robustness, making it suitable for clinical decision support, where both sensitivity (detecting cancerous instances) and specificity (correctly recognizing non-cancerous cases) are crucial.

Fig. 19

Confusion matrix analysis of the proposed model.
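The cell percentages and class-wise rates quoted above follow directly from the four counts in Fig. 19; the check below reconstructs them:

```python
import numpy as np

# 2x2 confusion matrix from Fig. 19 (rows = actual, columns = predicted)
cm = np.array([[108, 6],     # actual non-cancerous: 108 correct, 6 false positives
               [5, 115]])    # actual cancerous: 5 false negatives, 115 correct

percent = cm / cm.sum() * 100          # each cell as a share of all 234 slices
sensitivity = cm[1, 1] / cm[1].sum()   # 115 / 120 ≈ 0.958 (cancerous recalled)
specificity = cm[0, 0] / cm[0].sum()   # 108 / 114 ≈ 0.947 (non-cancerous recalled)

print(np.round(percent, 2))            # matches the 46.15 / 2.56 / 2.14 / 49.15 split
print(round(sensitivity, 3), round(specificity, 3))
```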

Analysis of computational time

Table 18 compares the computational time of several deep learning models used for lung cancer classification, with particular reference to the Modified ShuffleNet trained on gradient pattern and shape-based features with enhanced M-SegNet segmentation. Among the models studied, BiLSTM (29.03 s) and CMN-ShuffleNet (29.15 s) have the shortest computation times, demonstrating their suitability for real-time diagnostic applications. Conventional designs with higher latency, such as GRU (53.57 s) and LeNet (50.27 s), are less appropriate for applications requiring quick responses. While retaining superior classification accuracy, the proposed model also outpaces LinkNet (41.93 s), ATT-DenseNet (38.82 s), the baseline ShuffleNet (37.84 s), and the AlexNet-SVM hybrid (31.68 s). This overall computing efficiency enhances the Modified ShuffleNet’s usefulness for automated lung cancer diagnosis.

Table 18 Computational time analysis.
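Timing comparisons of this kind are typically gathered with a simple wall-clock harness. A hedged sketch (the `time_inference` helper and the stub model below are illustrative, not the paper's benchmarking code):

```python
import time

def time_inference(model_fn, batch, n_runs: int = 20) -> float:
    """Average wall-clock time per call, after one warm-up run."""
    model_fn(batch)                         # warm-up (caching, lazy init, etc.)
    start = time.perf_counter()
    for _ in range(n_runs):
        model_fn(batch)
    return (time.perf_counter() - start) / n_runs

# `model_fn` stands in for any classifier's forward pass; here a trivial stub.
avg = time_inference(lambda x: [v * 2 for v in x], list(range(1000)))
print(f"{avg * 1e6:.1f} µs per run")
```

Averaging over repeated runs after a warm-up call is what makes figures such as the 29.15 s vs. 53.57 s gap in Table 18 comparable across models.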

Comparison of the proposed model and the state-of-the-art technique

Table 19 compares the proposed model with prior approaches and clearly reveals its superior performance in lung cancer classification. While methods such as SVM (94.5% accuracy) and PCA-SMOTE-CNN (90.61% accuracy) offer promising results, they suffer from either lower accuracy or difficulties such as class imbalance, making them less trustworthy across clinical scenarios. Other models that perform well include CNN and ViTs (98% accuracy) and Sampangi Rama Reddy et al.'s MFDNN (95.4% accuracy); however, the former is constrained to a specific disease type (NSCLC), and the latter requires high-quality data that may not be readily available. Furthermore, the proposed solution employs a lightweight ShuffleNet architecture combined with modified recurrent residual blocks (mRRB), enabling effective segmentation and classification without excessive computational expense. This makes it particularly well suited to real-time clinical application, where both high performance and processing economy are critical. As a result, the proposed approach outperforms the state-of-the-art methods listed and offers a more reliable, effective, and accurate option for lung cancer detection.

Table 19 Comparison of the proposed method and state-of-the-art techniques.

Critical analysis

The four-phase method, comprising histogram equalization, mRRB-SegNet-based lobe segmentation, robust feature extraction (including ILGP, shape, and statistical descriptors), and final classification with CMN-ShuffleNet, addresses multiple crucial aspects of automated diagnosis. Although the metrics indicate better performance, generalizability remains to be confirmed and overfitting cannot be entirely ruled out. Histogram equalization enhances contrast, but its effect on subtle pathological traits should be considered carefully. Performance evaluation metrics such as accuracy, precision, F-measure, MCC (Matthews Correlation Coefficient), NPV (Negative Predictive Value), FPR (False Positive Rate), and FDR (False Discovery Rate) are crucial for assessing the robustness and dependability of the model across a range of diagnostic dimensions when classifying lung cancer with a Modified ShuffleNet trained on gradient pattern and shape-based features with enhanced M-SegNet segmentation. Metrics such as accuracy, precision, recall, and F1-score all improved significantly with DenseNet + Attention Mechanism16. By exploiting its dense connections, DenseNet facilitates effective feature learning, guaranteeing that every layer has access to information from earlier layers. ATT-DenseNet prioritizes accuracy gains through more intricate networks and attention-based mechanisms, whereas the Modified ShuffleNet with gradient features focuses on computational efficiency, which can yield faster real-time predictions. DenseNet models, however, can be computationally costly, particularly on larger datasets. In contrast to the proposed ShuffleNet approach, which is lighter but may not match its accuracy, ATT-DenseNet uses attention to prioritize specific regions of the image, making it well suited to complex tasks where fine-grained focus is required. The computational time analysis is carried out to confirm the efficiency of the proposed model over the existing methods.

The VGG16 model35 classified lung cancer into three groups with a high accuracy of 98.18%. Being a very deep model, however, VGG16 can be computationally expensive, particularly on large datasets of CT scans; its memory and processing requirements are significantly higher than those of the Modified ShuffleNet. ShuffleNet’s efficiency may make it more appropriate for applications requiring real-time classification with low latency, even though VGG16 delivers higher accuracy. Despite its accuracy, the VGG16 model is computationally demanding and less appropriate for resource-constrained settings. Because both models were trained on very small datasets, their generalizability across diverse clinical settings may be limited.

The ShuffleNet-based method does not appear to use data augmentation techniques, whereas LCGAN14 addresses data scarcity by augmenting training with synthetic data; this could impose limits when training data are insufficient. The GAN-based method can produce synthetic data to increase the model’s resilience and prevent overfitting, while the ShuffleNet-based method instead concentrates on improving the segmentation and feature extraction procedures without artificial data. Across the five folds, the proposed CMN-ShuffleNet, using the improved M-SegNet segmentation and gradient pattern and shape-based feature training, consistently outperformed all other models, achieving the highest accuracy in every fold (ranging from 0.9379 to 0.9503). This highlights the model’s improved ability to detect lung cancer accurately and reduce missed diagnoses, a crucial requirement in clinical decision support systems.
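The five-fold protocol underlying those per-fold accuracies can be sketched as a simple index partition (the fold construction below is a generic illustration; the paper does not specify its exact splitting code, and 234 is used only because it matches the test-set size in Fig. 19):

```python
import numpy as np

def kfold_indices(n: int, k: int = 5, seed: int = 0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n)
    for fold in np.array_split(idx, k):
        yield np.setdiff1d(idx, fold), fold   # train = everything outside the fold

# Sketch: split 234 samples into 5 disjoint folds and report fold sizes
folds = list(kfold_indices(234, k=5))
print([len(test) for _, test in folds])
```

Each sample appears in exactly one test fold, so the per-fold accuracies (here 0.9379 to 0.9503) together cover the whole dataset once.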

Practical implications

The real-world uses of a Modified ShuffleNet trained on gradient pattern and shape-based features for lung cancer classification, in conjunction with enhanced M-SegNet segmentation, span several significant areas of medical imaging and diagnostics. By helping radiologists correctly identify and categorize lung nodules in CT scans, this integrated strategy can be deployed in hospital radiology departments and mobile health units, allowing earlier and more accurate diagnosis of lung cancer. Its lightweight design enables real-time analysis on low-power or embedded devices, making it well suited to settings that lack high-end computing resources, such as point-of-care diagnostics, telemedicine, and rural healthcare. Finally, the precise segmentation provided by M-SegNet facilitates quantitative analysis, surgical planning, and treatment monitoring, enabling improved clinical outcomes and more individualized patient care.

Conclusion

The study focused on classifying lung cancer through a series of systematic phases: image pre-processing, lobe segmentation, feature extraction, and classification. Histogram equalization was used to pre-process the input lung images. Next, the mRRB-SegNet model was employed to segment the lung lobes from the pre-processed images. After segmentation, a variety of features, including ILGP, shape features, and statistical features, were extracted. These features then served as inputs to the classification phase, where the CMN-ShuffleNet model was employed to classify lung cancer. A comprehensive analysis of the CMN-ShuffleNet model’s performance revealed that it significantly outperformed existing approaches, leading to notable enhancements in classification accuracy. With 80% of the data used for training, the CMN-ShuffleNet model attained an FNR of 0.061, significantly lower than the other methods: GRU, LeNet, Bi-LSTM, LinkNet, ShuffleNet, modified AlexNet-SVM18, and ATT-DenseNet16 obtained higher FNR values of 0.187, 0.142, 0.167, 0.122, 0.134, 0.159, and 0.146, respectively.

Limitations

A limitation of the proposed model is its reliance on pre-segmentation accuracy (via M-SegNet): if segmentation fails, classification errors could increase. Within this constraint, combining gradient and shape-based characteristics enhances the discriminatory power for classifying lung cancer, the lightweight, high-performing design of the modified ShuffleNet enables faster inference, and the improved M-SegNet yields better lung region segmentation and therefore more precise downstream classification.

Future work

To increase classification accuracy, future studies should integrate multi-modal data (such as PET scans or medical records). Furthermore, a self-assessment and an EQUATOR Network checklist can be carried out. Future research should also include multi-center data and a variety of imaging modalities, including PET and MRI. Although this study focuses on statistical evaluation, further work will incorporate physician-involved evaluations and prospective clinical trials to better reflect practical applicability.