Abstract
Feature selection (FS) is a significant dimensionality reduction technique that can effectively remove redundant features. Metaheuristic algorithms have been widely employed in FS and have obtained satisfactory performance; among them, the grey wolf optimizer (GWO) has received widespread attention. However, GWO and its variants suffer from limited adaptability, poor diversity, and low accuracy when faced with high-dimensional data. The hybrid rice optimization (HRO) algorithm is an emerging metaheuristic derived from the heterosis and hybrid breeding mechanism found in nature, and it possesses a robust capacity to identify and converge towards optimal solutions. Therefore, this paper proposes a novel FS approach based on a multi-strategy collaborative GWO combined with the HRO algorithm (HRO-GWO). The HRO-GWO algorithm is enhanced by four innovative strategies: a dynamical regulation strategy and three search strategies. First, to improve the adaptability of GWO, the dynamical regulation strategy is devised for parameter optimization. Then, a multi-strategy co-evolution model inspired by HRO is designed, which utilizes neighborhood search, dual-crossover, and selfing techniques to bolster population diversity. Finally, the study develops a hybrid filter-wrapper framework incorporating the chi-square test and the HRO-GWO algorithm to efficiently select pertinent and informative feature subsets, enhancing classification performance while conserving time. The performance of HRO-GWO has been rigorously assessed on benchmark functions, and the effectiveness of the proposed framework has been evaluated on small-sample high-dimensional biomedical datasets. Our experimental findings demonstrate that the approach based on HRO-GWO outperforms state-of-the-art methods.
Introduction
In the big data era, the feature dimensions of collected data have increased exponentially, from dozens to tens of thousands of dimensions, which has rapidly increased the difficulty of data mining tasks1,2. Therefore, how to efficiently extract valuable information from data is a popular research topic in data mining and machine learning3,4.
Feature selection (FS) stands as a pivotal and extensively employed technique for dimensionality reduction that can obtain effective information from big data5,6. It selects feature subsets that encapsulate the most pertinent features while retaining the inherent physical significance of the original data7. FS methods can be classified into three categories according to the use of class labels: unsupervised, semi-supervised, and supervised8. Unsupervised methods can identify a subset of features without relying on class labels, but they may exhibit instability attributed to the absence of prior information9. Semi-supervised methods can handle data that contain both labeled and unlabeled samples, but they rely heavily on the accuracy of the labeled data. In comparison, supervised methods tend to achieve superior FS results when abundant labeled data are available, benefiting from the inclusion of class labels.
Supervised FS methods involve three search strategies: exhaustive search, sequential search10, and random search. The first two are less efficient. Random search, however, introduces randomness into the search process, thereby yielding comparatively superior results with good efficiency. In recent years, several metaheuristic algorithms have been extensively used in FS owing to their powerful search capabilities in large-scale spaces11,12, such as the Genetic Algorithm (GA)13,14, Aquila Optimizer (AO)15, Ant Colony Optimization (ACO) algorithm16, Sine Cosine Algorithm (SCA)17,18, Whale Optimization Algorithm (WOA)19, and Particle Swarm Optimization (PSO) algorithm20,21. Therefore, this paper selects a metaheuristic algorithm for FS on high-dimensional data and verifies the feasibility and superiority of the proposed method through experiments.
In 2014, the Grey Wolf Optimizer (GWO) was proposed22, a population-based metaheuristic algorithm that mimics the social hierarchy and group hunting behavior of grey wolves. Owing to its inherent simplicity, few control parameters, and strong optimization performance, GWO has found extensive applications in engineering problems23, anomaly detection24, band selection25, path planning26,27, FS28,29, and other fields30,31,32. M. Mafarja et al.33 identify the primary limitation of the convergence factor strategy: it shifts the algorithm from the exploration phase to the exploitation phase irrespective of the outcomes achieved thus far. To address this issue, they proposed a convergence control parameter (cp) that regulates the shift from exploration in the initial stages of the optimization process to exploitation in the subsequent stages. Nadimi-Shahraki et al.23 enhanced the hunting search strategy of wolves through a new search strategy named dimension learning-based hunting (DLH), which aims to address weaknesses including a deficiency in population diversity. J. Pirgazi et al.34 underscored the significance of feature relevance in big data analytics by proposing a gene selection method for high-dimensional datasets; the method is based on hybrid filter-wrapper metaheuristics designed to facilitate effective FS in large-scale genetic datasets. In summary, the GWO algorithm has three main limitations in the process of FS for small-sample high-dimensional data:
1. Limited adaptability: The adaptability and the balance between exploration and exploitation of GWO are limited when the convergence factor changes linearly33.
2. Poor diversity: The population evolution mode of GWO is singular, so it may easily become trapped in a local optimum23.
3. Low accuracy: Directly using the GWO algorithm for FS may lead to low accuracy because it ignores the relationship between feature information and the category34.
Therefore, the paper studies how to improve the GWO algorithm to address its three limitations in the field of high-dimensional FS. To alleviate these constraints, the concept of hybridizing with complementary metaheuristics has emerged as a promising approach.
The Hybrid Rice Optimization (HRO) algorithm35, which takes inspiration from heterosis theory, is a newly developed metaheuristic algorithm. Its good performance in solving the 0-1 knapsack problem36, the band selection problem37, computer-aided diagnosis38, and intrusion detection39 demonstrates its notable search efficiency and robust global search capabilities. Furthermore, the concept of hybridization combined with metaheuristic algorithms has been successfully applied to FS40.
To overcome the deficiencies of GWO for FS in high-dimensional data, the paper proposes a novel framework based on multi-strategy collaborative GWO combined with HRO (HRO-GWO) utilizing four innovative strategies. Initially, within the original GWO algorithm and its variations, the convergence factor decreases linearly as the number of iterations increases, resulting in poor adaptability. Thus, the paper adjusts the convergence factor through a dynamic regulation strategy to facilitate exploration of the search space at higher iteration counts. Then, since all grey wolves in the pack adjust their trajectory based on the same position update equation in the original GWO, they eventually gravitate towards the same search direction. To address this issue of poor diversity, a multi-strategy co-evolution model is introduced to bolster population diversity; it allows each grey wolf to learn from multiple evolutionary strategies and adapt its search behavior accordingly. The evaluation of HRO-GWO involves twelve CEC 2005 benchmark functions and thirty CEC 2017 benchmark functions. A comparison is presented with eleven metaheuristic algorithms, including PSO41, SCA42, WOA43, HRO35, GWO22, the Multi-Strategy GWO algorithm (MSGWO1, 2020)44, Improved GWO (I-GWO)23, Multi-Strategy GWO (MSGWO2, 2023)33, the Hippopotamus Optimization (HO) algorithm45, and the Ivy algorithm (IVYA)46, to demonstrate the superiority of HRO-GWO. Details on the algorithms used for comparison are shown in Table 1.

Furthermore, hybrid methods are widely utilized, given the presence of many irrelevant and redundant features in high-dimensional data34,47. FS methods are primarily grouped into three categories: filter, wrapper, and embedded48. Filter approaches evaluate features individually by measuring the relationship between features and the class labels49, which may compensate for the shortcomings of GWO in high-dimensional FS tasks. Thus, to improve accuracy, chi-square filtering is used to screen the features while preserving the correlation between features and categories.
Twelve small-sample high-dimensional biomedical datasets are used as the experimental data. The Naive Bayes (NB) algorithm and the K-Nearest Neighbor (KNN) algorithm are selected as classifiers to test the robustness of the proposed framework. Furthermore, additional algorithms are incorporated into the comparison experiments, including the Elite Genetic Algorithm (EGA)50 and three methods based on variants of HRO: Modified HRO (MHRO)37 and improved binary ACO melded with HRO in the relay model (R-IBACO) and the collaborative model (C-IBACO)40, in addition to the eight algorithms included in the benchmark function test. The experimental outcomes demonstrate that the high-dimensional FS framework utilizing the HRO-GWO algorithm proposed in the paper delivers satisfactory classification performance across diverse feature subset sizes.
To alleviate the three limitations of GWO in high-dimensional FS, the paper targets three solutions. The main contributions are as follows.
1. To control the exploration and exploitation of the entire algorithm, a dynamic regulation strategy is developed to adjust the convergence factor of the GWO algorithm, replacing the linear change. It mitigates the limited adaptability of the original GWO. Accuracy can be improved by up to about 1.2\(\%\).
2. Aiming at the shortcoming that a single algorithm may easily fall into a local optimum, a multi-strategy collaborative approach based on neighborhood search, dual-crossover, and selfing techniques is designed. It ensures that the GWO algorithm maintains strong exploitation ability while preserving diversity, which alleviates its poor diversity problem. Accuracy can be improved by up to nearly 2.5\(\%\).
3. A hybrid filter-wrapper framework combining HRO-GWO and the chi-square test is proposed, which can save space and time resources while improving accuracy. It alleviates the low accuracy of the original GWO in FS tasks. Accuracy can be improved by up to about 15.4\(\%\).
The remainder of the paper is organized as follows: Section “Related work” analyses recent developments in metaheuristic algorithms and FS methods. Section “Preliminaries” describes the preliminary concepts and theories. The high-dimensional FS framework of the HRO-GWO algorithm is introduced in Section “The proposed method”. The experimental setup, results, and analysis are described in Section “Experimental results and discussions”. Section “Conclusion and future work” presents the conclusions and future research directions.
Related work
A substantial body of research has been conducted on the topic of FS. Metaheuristic algorithms, rooted in the emulation of natural evolutionary principles, have the advantages of simple concepts, high efficiency, and easy implementation. In recent years, many scholars have opted to employ metaheuristic algorithms for addressing FS problems. This section reviews recent research on metaheuristic algorithms and various FS methods.
Metaheuristic algorithms
Zamani et al.51 proposed an Evolutionary Crow Search Algorithm (ECSA) to optimize the hyperparameters of artificial neural network for diagnosing chronic diseases. The improved algorithm effectively explored and exploited problem spaces by maintaining population diversity through the implementation of innovative strategies. Salgotra et al.52 developed a Multi-Hybrid Differential Evolution (MHDE) and applied it to four engineering design problems and for the weight minimization of three frame design problems. The incorporation of adaptive parameters, enhanced mutation, and other features had yielded favorable outcomes.
Feature selection method based on other approaches
Sun et al.53 developed two mechanisms to propose several revisions of Binary Monarch Butterfly Optimization (BMBO), improving classification efficiency for metaheuristic FS. Xie et al.54 created an enhanced multilayer binary firefly algorithm, achieving higher classification accuracy with fewer features. Scholars have extensively investigated metaheuristic algorithms such as GA, PSO, and GWO, demonstrating their efficacy in addressing FS challenges. However, the search space of these metaheuristic algorithms continues to grow with the problem size, resulting in low search efficiency, premature convergence, and trapping in local optima.
Jiang et al.55 introduced a Bayesian Robit regression approach with Hyper-LASSO priors (BayesHL) for FS in high-dimensional genomic data with grouping structure. It proved an effective tool in terms of predictive capability, sparsity, and the ability to uncover grouping structures. Nevertheless, the approach had not been applied to other models. A two-step hybrid FS method was developed by Moslemi et al.56; the conjunction of sparse subspace learning with nonnegative matrix factorization and GA produced a powerful effect. However, the generalization ability of the algorithm may be influenced by the sample size.
Feature selection method based on GWO
Wang et al.57 developed a role-oriented binary GWO. The effectiveness of the method was illustrated by the results of benchmark and FS tasks. Zhou et al.44 introduced MSGWO1, incorporating random guidance, local search, and subgroup cooperation strategies for FS. The study demonstrated its capability to efficiently identify the optimal feature combination and enhance the performance of the classification model. Despite the demonstrated effectiveness of GWO in addressing FS challenges, the efficacy of these algorithms diminishes as the problem size increases. GWO still has limitations in the process of FS for high-dimensional data.
Preliminaries
Grey wolf optimizer
The GWO algorithm22 is a metaheuristic algorithm proposed by simulating the predation process of wolves. Its inspiration comes from the social leadership and hunting behavior of grey wolves in nature. Within this social hierarchy, wolves are divided into four classes: \(\alpha\), \(\beta\), \(\delta\) and \(\omega\). In the GWO algorithm, the \(\alpha\), \(\beta\), and \(\delta\) wolves are taken as the three best solutions, and the remaining \(\omega\) wolves are directed towards promising areas to locate a global solution. The process of wolf hunting comprises three primary stages: encircling, hunting, and attacking prey.
Encircling prey Throughout the optimization process, when a group of grey wolves locates prey, they gradually encircle it. The behavior can be expressed by Eqs. (1 and 2),
where D denotes the current distance between the grey wolf and prey, and t is the current iteration. \({X_p^t}\) is the position of prey in the iterative process. \({X^t}\) is the position of grey wolf. A and C are coefficient vectors calculated by Eqs. (3 and 4),
where \(m_1\) and \(m_2\) are random vectors in the interval \(\left[ {0,1}\right]\), a is the convergence factor whose initial value is 2 and decreases linearly to 0 over the course of iterations expressed as Eq. (5),
where \(t_{max}\) indicates the maximum number of iterations.
Hunting prey The formula for guiding the position update of other wolves in the group, taking \(\alpha\), \(\beta\), and \(\delta\) wolves as the highest quality grey wolves in the space, is described as Eq. (6),
where,
where the positions of the three wolves, \(\alpha\), \(\beta\), and \(\delta\) in the search space are denoted by \(X_\alpha\), \(X_\beta\), and \(X_\delta\). \(D_\alpha\), \(D_\beta\), and \(D_\delta\) represent the distances between the current grey wolf and wolves \(\alpha\), \(\beta\), and \(\delta\) respectively. The position update formula for \(\omega\) wolves is described as Eq. (8),
Attacking prey After the hunting process, the grey wolf begins to attack its prey. The value of A ranges from -2 to 2 as a decreases linearly from 2 to 0 during the iterative process. When A is within the interval [-1, 1], the grey wolf attacks the prey and exploits the search space near it; otherwise, the wolf diverges from the prey to explore the search space.
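The encircling, hunting, and convergence-factor equations referenced above (Eqs. 1-8 and Eq. 5) can be sketched as follows. This is a minimal illustration of the canonical GWO update as described in the text; the function names and NumPy formulation are chosen here for illustration, not taken from the paper.

```python
import numpy as np

def linear_a(t, t_max):
    """Eq. (5): convergence factor decreasing linearly from 2 to 0."""
    return 2 * (1 - t / t_max)

def gwo_step(wolves, x_alpha, x_beta, x_delta, a):
    """One GWO position update (Eqs. 1-4, 6-8): each wolf moves toward
    the average of the positions suggested by the alpha, beta, and
    delta wolves."""
    new_wolves = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        candidates = []
        for leader in (x_alpha, x_beta, x_delta):
            A = 2 * a * np.random.rand(len(x)) - a   # Eq. (3)
            C = 2 * np.random.rand(len(x))           # Eq. (4)
            D = np.abs(C * leader - x)               # Eq. (1)
            candidates.append(leader - A * D)        # Eqs. (2), (7)
        new_wolves[i] = np.mean(candidates, axis=0)  # Eq. (8)
    return new_wolves
```

With this formulation, |A| shrinks with a over the iterations, which is exactly the exploration-to-exploitation transition discussed in the attacking-prey stage.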
Hybrid rice optimization algorithm
The HRO algorithm35 is a metaheuristic algorithm inspired by heterosis theory. It comprises three phases: hybridization operations, selfing operations, and renewal operations.
In each iteration, the population of rice seeds is sorted by fitness from superior to inferior and divided into three subpopulations of equal size. The subpopulation with the highest fitness is chosen as the maintainer line, the lowest as the sterile line, and the remaining subpopulation forms the restorer line.
Hybridization It is employed to renew the rice seed genes in the sterile line. Two rice seed populations with large differences in traits are randomly chosen from the maintainer line and the sterile line to generate new individuals. If the newly generated rice seed exhibits superiority over the incumbent seed, it will supersede the current seed. The process of generating new individuals could be expressed by Eq. (9),
where \(r_1\) and \(r_2\) are random values in the interval \(\left[ {0,1}\right]\). \(X_{s,a}^d\) represents the \(d\textrm{th}\) genes of the randomly chosen rice seeds from the sterile line. \(X_{m,b}^d\) denotes the \(d\textrm{th}\) genes of the randomly selected rice seeds from the maintainer line.
Selfing In the restorer line, selfing guides rice seeds to move closer to the current optimal solution, and the individuals can be updated as Eq. (10),
where \(r_3\) is a random value in the interval \(\left[ {0,1}\right]\). \(X_{best}^d\) signifies the \(d\textrm{th}\) genes of the optimum rice seeds. \(X_{r,j}^d\) denotes the \(d\textrm{th}\) genes of the randomly chosen rice seeds from the restorer line. \(X_{r,i}^d\) indicates the \(d\textrm{th}\) genes of the current rice seeds.
Renewal When a rice seed in the restorer line has not been upgraded for \(SC_{max}\) successive times, where \(SC_{max}\) is a pre-set parameter, the seed is reset as shown in Eq. (11),
where the random value \(r_4\) is in the interval \(\left[ {0,1}\right]\). \(X_{max}^d\) and \(X_{min}^d\) indicate the maximum and minimum limits of the \(d\textrm{th}\) dimensional search space.
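The three HRO operators described above can be sketched as below. The bodies assume the commonly published forms of Eqs. (9)-(11) consistent with the variable definitions in the text; function and variable names are illustrative.

```python
import numpy as np

def hybridization(x_sterile, x_maintainer):
    """Eq. (9), as commonly published: gene-wise weighted mix of a
    sterile-line and a maintainer-line seed (r1, r2 drawn per gene)."""
    r1 = np.random.rand(len(x_sterile))
    r2 = np.random.rand(len(x_sterile))
    return (r1 * x_sterile + r2 * x_maintainer) / (r1 + r2)

def selfing(x_best, x_rand, x_cur):
    """Eq. (10): move a restorer-line seed toward the current best seed,
    guided by the difference with a randomly chosen restorer seed."""
    r3 = np.random.rand(len(x_cur))
    return r3 * (x_best - x_rand) + x_cur

def renewal(x_min, x_max):
    """Eq. (11): reset a stagnant seed uniformly inside the bounds."""
    r4 = np.random.rand(len(x_min))
    return x_min + r4 * (x_max - x_min)
```

Hybridization produces a convex combination of the two parents, so new sterile-line genes always stay between the parents' gene values, while renewal reintroduces fully random individuals to escape stagnation.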
Dimension learning-based hunting search strategy
Nadimi-Shahraki et al.23 introduced the Dimension Learning-based Hunting (DLH) search strategy. It takes the distance between the current position of each wolf and its updated position as a radius and uses the current position as the center to construct a neighborhood in which to search for neighbors. The search radius \(R_i^t\) is calculated using Eq. (12).
Neighbor set \(N_i^t\) of grey wolf \(X_i\) at iteration t is expressed as Eq. (13),
where \(D_i\left( X_i^t,X_j^t\right)\) denotes the Euclidean distance between grey wolf \(X_i^t\) and grey wolf \(X_j^t\), and \(j\ne i\). The formula for creating a new wolf is expressed as Eq. (14),
where \(X_{idlh}^t\) is the new individual. \(N_{i,r}^t\) represents an individual randomly selected from the neighbor set. \(X_r^t\) denotes a randomly chosen individual in the current population.
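The DLH construction in Eqs. (12)-(14) can be sketched as follows; the function name and the fallback when the neighborhood is empty are choices made here for illustration, not specified in the text.

```python
import numpy as np

def dlh_candidate(X, i, x_gwo_next):
    """DLH search (Eqs. 12-14): the radius is wolf i's distance to its
    GWO-updated position (Eq. 12); wolves within that radius form the
    neighbor set (Eq. 13); the candidate learns from a random neighbor
    and a random population member (Eq. 14)."""
    radius = np.linalg.norm(X[i] - x_gwo_next)                 # Eq. (12)
    dists = np.linalg.norm(X - X[i], axis=1)
    neighbors = [j for j in range(len(X)) if dists[j] <= radius and j != i]
    if not neighbors:          # assumed fallback: use the whole pack
        neighbors = [j for j in range(len(X)) if j != i]
    x_n = X[np.random.choice(neighbors)]                       # N_{i,r}^t
    x_r = X[np.random.randint(len(X))]                         # X_r^t
    return X[i] + np.random.rand(X.shape[1]) * (x_n - x_r)     # Eq. (14)
```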
Chi-square feature selection
The chi-square test uses a normalized statistic to determine whether the difference between observed and expected frequencies is statistically significant. Chi-square FS chooses features based on their \(\chi ^2\) score in relation to the target variable. The \(\chi ^2\) statistic is defined as Eq. (15),
where N denotes the total number of samples. A represents the number of samples containing the feature t and labeled as category c. B is the number of samples with feature t but not labeled as category c. C represents the number of samples without feature t and belongs to category c. D indicates the number of samples that do not include t and do not belong to c.
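Given the counts A, B, C, D defined above, Eq. (15) corresponds to the standard chi-square statistic for a feature/class 2x2 contingency table; a minimal sketch (function name chosen here):

```python
def chi2_score(A, B, C, D):
    """Eq. (15): chi2 = N * (A*D - B*C)^2 / ((A+C)(B+D)(A+B)(C+D)).
    A: samples with feature t in class c; B: with t, not in c;
    C: without t, in c; D: without t, not in c."""
    N = A + B + C + D
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return N * (A * D - B * C) ** 2 / denom if denom else 0.0
```

A feature distributed independently of the class (A = B = C = D) scores 0, while a perfectly class-aligned feature attains the maximum score N, which is why ranking by this score performs the coarse relevance filtering used later in the framework.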
The proposed method
In this section, the overall framework of the proposed method is first outlined. Then the HRO-GWO algorithm, the dynamical regulation strategy, the multi-strategy co-evolution model, and the framework combining chi-square and HRO-GWO are explained in detail.
Model overview
Feature selection framework based on HRO-GWO algorithm.
In the study, a novel framework based on HRO-GWO is proposed with the aim of improving the performance of GWO for high-dimensional FS tasks. The proposed model can be divided into three distinct phases: data preprocessing, FS, and classification evaluation. The proposed hybrid filter-wrapper framework integrates chi-square and HRO-GWO into this process.
The architecture of the proposed framework is pictured in Fig. 1. The data initially undergo preprocessing before the chi-square filtering is applied. Then, chi-square is used to coarsely filter high-dimensional features to obtain a candidate feature subset. Subsequently, the proposed HRO-GWO algorithm is employed to select the most valuable features from the candidate feature subset according to the feedback results of the classifier, creating the final feature subset. Finally, the selected feature subset serves as input to the classifier to obtain the final evaluation result.
Feature selection based on the HRO-GWO algorithm
Aiming at the problems of poor adaptability and diversity of the GWO algorithm, the HRO-GWO algorithm introduces an adaptive adjustment strategy and three update strategies for individuals.
The HRO-GWO algorithm originally operates in continuous space, whereas the FS problem is a typical discrete problem. To bridge this gap, a conventional binary coding technique can be utilized to convert the solutions of metaheuristic algorithms designed for continuous spaces into discrete ones.
The HRO-GWO algorithm
The HRO-GWO algorithm has two key points. One is the dynamical regulation strategy for the convergence factor, which enhances the balance between exploration and exploitation of GWO. The other is a multi-strategy co-evolution model that incorporates three additional grey wolf position update strategies: neighborhood search, dual-crossover, and selfing. The model addresses the tendency of the algorithm to fall into local optima. The flowchart of the HRO-GWO algorithm is depicted in Fig. 2.
The flowchart of the HRO-GWO algorithm.
The pseudocode of HRO-GWO is shown in Algorithm 1. The HRO-GWO algorithm initiates by randomly generating an initial population within the predefined search space. At each iteration, the fitness function assesses the positions of wolves and identifies the top three with the best fitness value, designated as \(\alpha\), \(\beta\), and \(\delta\). Subsequently, the dynamical regulation strategy is developed to adjust the convergence factor a. Then, each wolf updates its position through the original update strategy and three novel update strategies. The iterative process continues until reaching the predetermined number of iterations (\(t_{max}\)), which serves as the stopping criterion.
Pseudocode for the HRO-GWO algorithm
Binary encoding rules and fitness function
To transform continuous data into discrete data, the paper utilizes a conversion function. The HRO-GWO algorithm converts the real value of each bit into 1 or 0: 1 indicates that the corresponding feature will be used for training, while 0 indicates that it will not. The formula is described as Eq. (16).
The FS method minimizes a fitness function that aims at enhancing classification accuracy while reducing redundant features. The fitness function is described as Eq. (17),
where fitness denotes the fitness value, error is the classification error rate. The \(n_s\) and \(n_c\) represent the number of selected feature subsets and the total number of features, respectively. The weighting factor \(\alpha\) is employed to balance the classification error rate and the number of selected features, and its range is [0,1]. For the paper, \(\alpha\) is set to 0.99.
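The binarization and fitness computation can be sketched as below. Eq. (17) follows directly from the definitions above; for Eq. (16) the paper's exact transfer function is not reproduced here, so a common sigmoid transfer is assumed and labeled as such.

```python
import numpy as np

def binarize(x):
    """Eq. (16), sketched with a common sigmoid transfer function
    (assumption; the paper's exact conversion may differ): a bit is 1
    when the sigmoid of the real value exceeds a uniform threshold."""
    return (1 / (1 + np.exp(-x)) > np.random.rand(len(x))).astype(int)

def fitness(error, n_selected, n_total, alpha=0.99):
    """Eq. (17): weighted sum of the classification error rate and the
    fraction of features kept; alpha = 0.99 as in the paper."""
    return alpha * error + (1 - alpha) * (n_selected / n_total)
```

With alpha = 0.99, classification error dominates the objective, and the subset-size term mainly breaks ties between subsets of equal accuracy in favor of fewer features.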
Dynamical regulation strategy for convergence factor
The GWO algorithm controls the global search or the local search by the value of coefficient A. As the value of A changes with the convergence factor a, the factor is essential in achieving a balance between exploration and exploitation in the algorithm.
The application of a fixed convergence factor schedule may result in a lack of adaptability: the transition from exploration to exploitation cannot be adjusted. Inspired by the literature58, the paper proposes a convergence factor adaptive strategy called the Dynamical Regulation Strategy (DRS). DRS utilizes Eq. (18) instead of the original linear function, enabling a non-linear decrease in values throughout the iteration and permitting the adjustment of both exploration and exploitation. Hence, this strategy can direct the algorithm to strengthen local search in the early stages of the iteration and to increase the probability of exploration towards the later stages, addressing the deficiency in adaptability inherent to the original algorithm.
where t is the current iteration, and \(t_{max}\) is the maximum iteration. \(\varepsilon\) and \(\omega\) are adjustment factors. This strategy allows the algorithm to tune the balance between exploration and exploitation throughout the iterative process.
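Since Eq. (18) itself is not reproduced in this excerpt, the sketch below uses one plausible non-linear decay with adjustment factors \(\varepsilon\) and \(\omega\); the exact expression, the function name, and the default factor values are assumptions for illustration only.

```python
def drs_a(t, t_max, eps=2.0, omega=0.5):
    """Dynamical Regulation Strategy for the convergence factor
    (stand-in for Eq. 18; the paper's exact form may differ).
    A non-linear decay from 2 to 0 whose shape is tuned by the
    adjustment factors eps and omega."""
    return 2 * (1 - (t / t_max) ** eps) ** omega
```

Compared with the linear schedule of Eq. (5), varying eps and omega changes how long the factor stays large (favoring exploration) before collapsing towards 0 (favoring exploitation).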
Multi-strategy co-evolution model
In the GWO algorithm and its variations, all grey wolves in the group change their trajectory based on the same position update equation and eventually gravitate towards the same search direction. Therefore, this section introduces the idea of multi-strategy co-evolution, allowing each grey wolf to adjust its search behavior from multiple perspectives. The Neighborhood Search Strategy enables the current wolf to learn from its good neighboring wolves, thereby conducting a local search. The Dual-Crossover Strategy randomizes the learning process from the three leading wolves at the outset of the iteration; this ensures that no choice is permanently excluded from consideration, avoiding the potential disadvantage of discarding wolves that have made superior choices but were not selected. Concurrently, randomly alternating between it and the Selfing Strategy circumvents the diminished convergence velocity that would result from an excess of randomness during the initial phase. The combination of these three strategies enhances the diversity of the population and mitigates the tendency to fall into local optima.
Neighborhood search strategy
The Neighborhood Search Strategy (NSS) is an optimization technique that begins with an initial solution and then searches for individual solutions within its neighborhood. The technique is often more effective than global search in discovering the current optimal solution in a vast search space.
The process of generating new individuals through neighborhood search.
In the GWO algorithm, \(\alpha\), \(\beta\), and \(\delta\) wolves guide the entire population forward, resulting in slow convergence of the GWO algorithm and an increased likelihood of getting trapped in a local optimum. To solve the problem, NSS inspired by the DLH search strategy is developed. The process of neighborhood search to generate new individuals is shown in Fig. 3. The updated position of the current wolf in iteration t is represented by \(X_{ins}^t\). The individual is generated by Eq. (19),
where,
and,
where \(X_i^t\) is the current position of the wolf in iteration t. \(N_{i,r}^t\) represents an individual randomly selected from the neighbor set. \(X_r^t\) represents a randomly selected individual in the current population.
Dual-crossover strategy
The inspiration for the Dual-Crossover Strategy (DCS) comes from hybridization operations of the HRO algorithm. It performs a hybridization operation on \(X_\alpha\), \(X_\beta\), and \(X_\delta\) while retaining the information of \(X_i\) to generate new individuals. In the GWO algorithm, a single evolutionary strategy causes the population to lose diversity prematurely. Therefore, DCS uses the crossover strategy combined with hybridization techniques to improve the way individuals are updated, as shown in Fig. 4. The individual for the first cross-update \(X_{cronew}\) is generated by Eq. (22),
The process of generating new individuals through dual-crossover.
where,
where A is calculated by Eq. (3), \(C_{R1}\) is set to 0.8 in the paper. \(X_1^d\) is the d dimension gene of individual \(X_1\). \(r_1\), \(r_2\), and \(r_3\) are random values in the interval \(\left[ {0,1}\right]\). \(X_{\alpha }^d\), \(X_{\beta }^d\), and \(X_{\delta }^d\) are the d dimension genes of individual \(X_{\alpha }\), \(X_{\beta }\), and \(X_{\delta }\).
The individual for the second cross-update \(X_{icross}\) is generated by Eq. (24),
where \(X_i^d\) denotes the d dimension gene of the current wolf \(X_i\). \(X_{cronew}^d\) is the d dimension gene of the individual for the first cross-update \(X_{cronew}\). \(C_{R2}\) is set to 0.3 in the paper.
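The two crossover steps of the DCS can be sketched as below. Since Eqs. (22)-(24) are not reproduced in this excerpt, the gene-selection rule of the first cross and the role of A are assumptions consistent with the surrounding description (leader genes mixed per dimension with crossover rate \(C_{R1}\), then a binomial crossover with the current wolf at rate \(C_{R2}\)); treat it as illustrative only.

```python
import numpy as np

def dual_crossover(x_alpha, x_beta, x_delta, x_cur, a, cr1=0.8, cr2=0.3):
    """Dual-Crossover Strategy, sketched (stand-in for Eqs. 22-24).
    First cross: each gene is drawn from a randomly chosen leader
    (alpha/beta/delta) and occasionally perturbed by A (assumed rule).
    Second cross: binomial crossover with the current wolf at rate cr2."""
    d = len(x_cur)
    leaders = np.stack([x_alpha, x_beta, x_delta])      # shape (3, d)
    pick = np.random.randint(0, 3, size=d)
    x_cronew = leaders[pick, np.arange(d)]              # first cross
    A = 2 * a * np.random.rand(d) - a                   # Eq. (3)
    perturb = np.random.rand(d) > cr1
    x_cronew = np.where(perturb, x_cronew - A * x_cronew, x_cronew)
    keep = np.random.rand(d) < cr2                      # second cross
    return np.where(keep, x_cronew, x_cur)
```

The second cross retains most genes of the current wolf (cr2 = 0.3), so DCS injects leader information gradually instead of overwriting the individual wholesale.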
Selfing strategy
In the original GWO algorithm, wolf positions are updated in line with the optimal solution, so GWO is easily trapped in a local optimum. Inspired by the selfing operation of the HRO algorithm, the paper presents the Selfing Strategy (SS) to enhance global search capability by combining genes among the current wolf, a randomly selected wolf, and the \(\alpha\) wolf. The individual is generated by Eq. (25),
where \(r_4\) is a random value in the interval \(\left[ {0,1}\right]\). \(X_\alpha ^d\) signifies the \(d\textrm{th}\) genes of the \(\alpha\) wolf. \(X_r^d\) denotes the \(d\textrm{th}\) genes of the randomly chosen wolf from the population. \(X_i^d\) indicates the \(d\textrm{th}\) genes of the current wolf.
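By analogy with the HRO selfing operation of Eq. (10) and the variable definitions above, Eq. (25) can be sketched as follows; the precise form is assumed, with the \(\alpha\) wolf playing the role of the best seed.

```python
import numpy as np

def selfing_strategy(x_alpha, x_rand, x_cur):
    """Selfing Strategy (assumed form of Eq. 25, mirroring HRO's
    Eq. 10): step from the current wolf along the difference between
    the alpha wolf and a randomly chosen wolf, scaled per gene by r4."""
    r4 = np.random.rand(len(x_cur))
    return r4 * (x_alpha - x_rand) + x_cur
```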
Feature selection framework based on chi-square and HRO-GWO
The application of metaheuristic algorithms in isolation renders high-dimensional FS less efficient, as an excessively expansive search space constrains their performance. Additionally, metaheuristic algorithms execute FS without considering the interrelationships between features and targets. The chi-square filter is an efficient method for removing the least relevant features. It is particularly well suited to assessing the relationship between categorical features and the target variable, allowing rapid identification of important features. In most cases, it is computationally efficient and well-suited to high-dimensional datasets. The incorporation of chi-square filtering enables HRO-GWO to conduct searches within the requisite feature dimensions, which not only reduces the time required but also enhances capability. Consequently, the proposed hybrid filter-wrapper framework combines chi-square and HRO-GWO.
The framework starts with a coarse filtering of the dataset using the chi-square technique: the \(\chi ^2\) score is calculated for each feature, and only the necessary and relevant features are selected, yielding a subset of candidate features. Moreover, the percentage of features retained by chi-square filtering can be chosen according to the specific situation.
After chi-square filtering, the candidate feature subset serves as input to HRO-GWO for further screening. The HRO-GWO algorithm selects the most valuable features from the candidate subset according to feedback from the classifier, iterating until it outputs a binary string representing the final feature subset.
Time complexity analysis
The time complexity of HRO-GWO depends on two principal phases: initialization, and individual evaluation and update. Therefore, the total time complexity of HRO-GWO is \(O(N + T \times (N\times (3\times D)))\), where T represents the number of iterations, N denotes the population size, and D symbolizes the dimension of individuals. The time complexity of GWO, HRO, their variants, and HRO-GWO at each stage is analyzed comprehensively in Table 2. Among them, the time complexity of GWO, I-GWO, and MSGWO2 arises from the same sources as that of HRO-GWO. In contrast, HRO, MSGWO1, MHRO, R-IBACO, and C-IBACO must additionally rank all fitness values after updating the individuals at each iteration. For C-IBACO, \(N_1\) and \(N_2\) signify the subpopulation sizes of HRO and IBACO. Furthermore, pheromone updates are necessary for R-IBACO and C-IBACO. Although the hybrid algorithm increases the time complexity, it markedly enhances the model’s performance, which is deemed acceptable.
Experimental results and discussions
This section evaluates the capabilities of the HRO-GWO algorithm on various test functions and the performance of the framework based on HRO-GWO on small-sample high-dimensional biomedical datasets.
The research in this paper focuses on remedying the deficiencies of the grey wolf optimizer in small-sample high-dimensional FS tasks by proposing the HRO-GWO algorithm. The CEC benchmark functions provide a standardized testing platform that enables researchers to evaluate various optimization algorithms under identical conditions. Accordingly, the performance of HRO-GWO is first assessed on the CEC benchmark functions. Subsequently, the efficacy of the hybrid FS framework based on HRO-GWO is evaluated using 12 biomedical datasets for dimensionality reduction on small-sample high-dimensional data. In the experiments, the framework based on HRO-GWO and chi-square is compared with other filtering methods and metaheuristic algorithms. The robustness of the method is verified by employing various classifiers and by varying the dimensionality reduction rate. Moreover, the validity of the results is verified using statistical evaluation. Finally, ablation studies are conducted on the proposed method to illustrate the impact of each strategy, substantiating that the combination of these strategies optimizes the performance of the approach.
Experiment on benchmark functions
Twelve test functions from the widely used CEC 2005 benchmark suite and twenty-nine from the CEC 2017 benchmark suite59 are applied to evaluate the performance of HRO-GWO. The results of HRO-GWO are compared with state-of-the-art metaheuristic algorithms: PSO41, SCA42, WOA43, HRO35, GWO22, MSGWO1 (2020)44, I-GWO23, MSGWO2 (2023)33, HO45, and IVYA46. As shown in Table 3, in all experiments the parameters of the comparative algorithms are set to the values recommended in their original studies.
CEC 2005 test functions include unimodal (F1-F3), ordinary multimodal (F4-F6), and fixed-dimensional multimodal functions (F7-F12). Six benchmark functions (F1-F6) are evaluated with different dimensions of 10, 20, and 30. Six benchmark functions (F7-F12) are evaluated with their stable dimensions. Each of them is evaluated by 30 independent runs. The total number of function evaluations of each algorithm is 50000. Table 4 shows the function expressions, test dimensions, variable ranges, and optima for the twelve benchmark functions.
Tables 5, 6 and 7 display the results for HRO-GWO and the baseline methods on the twelve test functions in terms of the best, average, worst, and standard deviation of fitness, with the best values in bold. Moreover, the last row of each table, labeled “w/t/l”, shows the number of wins (w), ties (t), and losses (l) of each algorithm. The results of the comparative analysis are illustrated in Fig. 5. Specifically, HRO-GWO wins on all three of the best, average, and worst metrics simultaneously in 66.7% of the comparisons. For the average value, HRO-GWO achieves a remarkable win rate of 95.8%; for the worst value, 91.7%; and for the best value, 79.2%.
The unimodal test functions are well-suited for assessing the exploitation capability of algorithms in finding optimal solutions. The results in Table 5 show that HRO-GWO delivers highly competitive outcomes on the unimodal test functions. In particular, it shows significantly enhanced results on F2 across all dimensions and evaluation metrics. In the nine comparisons on the unimodal test functions, HRO-GWO is the top performer in all nine average-case comparisons and in eight of the best-case and worst-case comparisons. HRO-GWO does not rank first for the best value on F1 with dimension 30 or for the worst value on F3 with dimension 10; in all other cases it achieves the best results in terms of best, worst, and average fitness values.
According to the results in Table 6, the experiments on F4-F6 across various dimensions, whose complexity escalates as the dimension increases, demonstrate the competitive exploration capability of the algorithm. Out of nine comparisons on the ordinary multimodal functions, HRO-GWO wins nine times in both the average-value and worst-value comparisons, and five times in the best-value comparison. On F4, both MSGWO2 and HRO-GWO achieve the theoretical optimum of 0 for all dimensions. On F5 and F6, HRO-GWO outperforms the other algorithms in terms of the average and worst fitness values for all dimensions. These results demonstrate the efficacy of the proposed HRO-GWO algorithm in addressing the challenges posed by ordinary multimodal functions.
The number of wins, losses and ties of each algorithm.
The results presented in Table 7 show that HRO-GWO is superior on most fixed-dimensional multimodal functions. Across the six experiments on these functions, HRO-GWO achieves the optimal average value in five instances, and wins in six and five instances for the best and worst values, respectively. Notably, HRO-GWO attains the theoretical optimum on all three evaluation metrics on F7, F8, and F9. As the results illustrate, HRO-GWO exhibits an excellent balance between exploration and exploitation that effectively avoids local optima to the maximum extent.
Furthermore, the CEC 2017 benchmark set, comprising 29 highly challenging single-objective optimization problems, is employed for performance testing. To evaluate the proposed HRO-GWO algorithm, several algorithms are selected for comparison: the original GWO algorithm22, the HRO algorithm35, and two more recent algorithms, HO45 and IVYA46, which performed well in the CEC competition. The total number of function evaluations of each algorithm is 50000, and each is evaluated over 30 independent runs. Table 8 presents the best, average, worst, and standard deviation of fitness for dimension sizes of 10, 30 and 50, with the best values in bold. Moreover, the last row of the table, labeled “w/t/l”, shows the number of wins (w), ties (t), and losses (l) of each algorithm. The results indicate the efficacy of HRO-GWO in addressing optimization problems.
For the unimodal problem F3, HRO-GWO achieves excellent results at dimensions 30 and 50, while HO and IVYA attain favorable outcomes in the remaining case. For the multimodal problems F4 to F10 at dimension 10, HRO-GWO exhibits excellent performance on more than 85% of the functions. On F5, F6, F7, F8, and F9, the algorithm outperforms the others at dimension 30 in terms of the average value. At dimension 50, HRO-GWO demonstrates optimal performance on F6, F7, and F9. For the hybrid problems, HRO-GWO wins 20, 14 and 18 times in the average, best and worst values, respectively. For the composite problems F21 to F30, HRO-GWO yields the most optimal outcomes on F27 and F29 across all dimensions and all evaluation indicators. Across all competitions, HRO-GWO achieves a first-place ranking on 132 occasions, while IVYA attains a second-place ranking on 76 occasions.
Experiment on biomedical datasets
To further investigate the superiority of proposed methods, the experiments are conducted on twelve public small-sample high-dimensional biomedical datasets by 20 independent runs. Table 9 summarizes the characteristics of these datasets, including the number of features, instances, and classes. The datasets are available at https://github.com/taiyang479/HRO-GWO.git.
First, since the effectiveness of the hybrid filter-wrapper method has been proven49,60, and to ensure fairness, the study applies the chi-square technique as a filter to select 10\(\%\) or 20\(\%\) of the features for all algorithms, improving the classification performance while saving time. This enables all algorithms to search within the same feature space. Then, to avoid model overfitting, ten-fold cross-validation is used to generate the training and test sets, owing to the small number of samples in biomedical datasets. The datasets are divided into ten parts, with nine employed as the training set and one as the testing set during the FS process. Both NB and KNN are relatively simple classification algorithms that are straightforward to implement and comprehend; they require neither complex parameter tuning nor training processes and are therefore well-suited as benchmark classifiers for assessing the efficacy of FS. Consequently, the classifiers NB and KNN are selected to test the robustness of the proposed method. To verify the generality of the method, the same parameters are applied to the algorithm when using different classifiers. In the context of biomedical data classification, accuracy represents a fundamental metric for evaluating model performance, and the reduction of redundant features is a pivotal objective of FS. Accordingly, the fitness function is constructed to minimize both the number of features and the error rate. Average fitness, average classification accuracy, and average number of selected features are selected as the principal evaluation metrics to provide a comprehensive performance appraisal of the different algorithms.
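A minimal sketch of a fitness function combining error rate and feature-subset size is given below. The weighting `alpha = 0.99` is a common choice in metaheuristic FS, not a value stated by the paper, and the helper names are illustrative:

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Wrapper fitness sketch: jointly minimize classification error and
    the fraction of selected features. alpha = 0.99 is an assumed weight
    emphasizing accuracy over subset size (lower fitness is better)."""
    return alpha * error_rate + (1 - alpha) * (n_selected / n_total)

def decode(binary_vector):
    """Map a binary wolf position to the indices of selected features."""
    return [d for d, bit in enumerate(binary_vector) if bit == 1]
```

For example, `fs_fitness(0.1, 10, 100)` evaluates a subset that misclassifies 10% of samples while keeping 10% of features; because the error term dominates, a subset with slightly fewer features never outranks one with noticeably lower error.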
Comparison to other FS approaches
To ascertain the superiority of chi-square filtering, the study conducts comparative experiments using three other filtering methods: Decision Tree, ReliefF, and Mutual Information (MI). The classifiers NB and KNN are selected for their ease of implementation. The filtering percentages are set at 10% and 20%, respectively, based on empirical evidence. The fitness values of the above methods are presented in Table 10, with the optimal values highlighted in bold. Moreover, the last row of the table, labeled “w/t/l”, shows the number of wins (w), ties (t), and losses (l) of each algorithm.
The experimental results demonstrate that the HRO-GWO framework, when combined with chi-square filtering, achieves the optimal fitness value on a total of 25 occasions. In comparison, Decision Tree, ReliefF, and MI achieve the optimal fitness value on 3, 6, and 13 occasions, respectively. The approach based on HRO-GWO and chi-square filtering is the most effective in the study.
A fair comparison is conducted with the ten state-of-the-art approaches mentioned in the previous section, as well as EGA50, MHRO37, R-IBACO and C-IBACO40. Each is evaluated over 20 independent runs, and in each run the total number of function evaluations of each algorithm is set to 50000. As illustrated in Table 11, the parameters of the comparative algorithms are set to the values recommended in their respective studies. Across the forty-eight experiments on high-dimensional biomedical datasets, the framework based on HRO-GWO emerges as the top performer in fitness value, accuracy, and number of features on the majority of occasions.
The fitness values of this group of methods are displayed in Table 12, with the best values in bold. The proposed method achieves the optimal fitness value under most conditions. For example, when 10% of the features are retained before applying the algorithms to the GLIOMA dataset with NB, HRO-GWO attains a fitness value of 6.180e−02. This value is lower than 2.110e−01, 1.689e−01, 1.850e−01, 2.501e−01, 1.736e−01, 9.673e−02, 9.672e−02, 8.331e−02, 9.442e−02, 1.462e−01, 7.127e−02, 7.762e−02, 1.395e−01, and 1.991e−01, which are obtained by PSO, SCA, WOA, EGA, HRO, GWO, MSGWO1, I-GWO, MSGWO2, MHRO, R-IBACO, C-IBACO, HO, and IVYA, respectively.
The convergence results are shown in Figs. 6 and 7, indicating that HRO-GWO quickly converges to the vicinity of the most promising region of the search space while maintaining a certain level of global exploration capability in subsequent iterations. The DRS balances the exploitation and exploration capabilities of the algorithm. The multi-strategy co-evolution model enables certain grey wolves within the pack to specialize in global search while others concentrate on local search; this enhances population diversity and mitigates the risk of becoming trapped in local optima. For example, the iteration curves of the algorithms on the CNS dataset, using the NB classifier with a filtering ratio of 10% of features, are presented in subfigure 7a of Fig. 7, where HRO-GWO achieves a better fitness value in the shortest time. Furthermore, as illustrated in subfigure 6a, the curve of HRO-GWO continues to fluctuate after 800 iterations, demonstrating that the algorithm retains the capacity to escape local solutions in the later stages of the process. In contrast, the iteration curves of the original GWO algorithm and the majority of its variants show no downward trend at the later stage.
Average fitness convergence curves on the first six datasets. P represents the proportion of selection features.
Average fitness convergence curves on the last six datasets. P represents the proportion of selection features.
The comparison results for accuracy and the number of selected features are presented in Table 13, with the best values in bold. Figure 8 illustrates the accuracy achieved by each algorithm in conjunction with the two classifiers and two filtering ratios. Under the four distinct experimental conditions across twelve datasets, HRO-GWO demonstrates the highest average classification accuracy in most cases. On certain datasets, the average accuracies of the algorithms differ only slightly. For instance, on the Leukemia 2 dataset all algorithms achieve a high level of accuracy under all four conditions: HRO-GWO provides the highest accuracy at 99.995%, while PSO provides the lowest at 97.054%. However, some datasets exhibit substantial variations in experimental outcomes. For example, on the TOX_171 dataset, the best and worst classification accuracies are 94.641% and 73.628%, obtained by HRO-GWO and EGA, respectively.
Average accuracy on NB and KNN. Percent represents the proportion of selection features.
Average selected features on NB and KNN. Percent represents the proportion of selection features.
The number of wins and ties for three metrics.
Figure 9 displays the average number of features selected by each algorithm. In the Leukemia 2, Lung_cancer, and Ovarian datasets, HRO-GWO selects the smallest average number of features but still obtains the optimal average accuracy. Particularly within the Ovarian dataset, HRO-GWO selects only 3.2 features but achieves an impressive accuracy of 99.999%. Furthermore, the gap between the maximum and minimum number of selected features amounts to 1472.8 in the Ovarian dataset. It is evident that the framework incorporating HRO-GWO exhibits a remarkable capacity for dimensionality reduction across high-dimensional datasets.
Table 14 presents a summary of the total number of wins, losses, and ties for all three metrics across all scenarios. As shown in Fig. 10, HRO-GWO wins in all cases for the fitness value. Concerning accuracy, HRO-GWO demonstrates the highest accuracy on 35 occasions, with a tie recorded on 11 occasions, illustrating its capacity to select the most pertinent features while ensuring accuracy. Regarding the number of features, HRO-GWO selects the fewest remaining features 23 times, indicating an excellent dimensionality reduction rate.
Nonparametric test
The results on the NB and KNN classifiers are evaluated through the implementation of Wilcoxon signed-rank test and Friedman test. The objective is to ascertain the significance of the discrepancy in accuracy between the proposed HRO-GWO and the comparative metaheuristic-based FS methods.
As shown in Table 15, a Wilcoxon signed-rank test is conducted as a pair-wise assessment. \(R^{+}\) represents the sum of the ranks at which HRO-GWO exhibits superior performance relative to the comparative method, while \(R^{-}\) indicates the negative rank sum. The P value denotes the level of significance, with \(P<0.05\) signifying a statistically notable discrepancy between the two algorithms under comparison. All P values displayed in the table are less than 0.05, indicating that the findings of the study are statistically significant.
The results of the Friedman test are presented in Table 16, which includes the average accuracy rankings across all datasets, the final rankings, and the P values for each algorithm under both classifiers. The P values of 9.860e−55 and 8.867e−52, both below the predetermined significance level of 0.01, indicate a statistically significant difference between the algorithms. Furthermore, the final rankings demonstrate that HRO-GWO exhibits the highest accuracy when evaluated using both NB and KNN.
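The Friedman statistic underlying such rankings can be sketched in a few lines. This simplified version ranks algorithms per dataset by accuracy and breaks rank ties by column order (a full implementation would assign average ranks, e.g. via `scipy.stats.friedmanchisquare`):

```python
import numpy as np

def friedman_statistic(results):
    """Friedman chi-square statistic sketch for an (N datasets x k
    algorithms) accuracy matrix. Rank 1 = highest accuracy per dataset;
    ties are broken by column order here, a simplification. The p-value
    follows a chi-square distribution with k - 1 degrees of freedom."""
    n, k = results.shape
    # double argsort converts each row of accuracies into ranks 1..k
    ranks = (-results).argsort(axis=1).argsort(axis=1) + 1.0
    mean_ranks = ranks.mean(axis=0)
    return 12.0 * n / (k * (k + 1)) * (
        np.sum(mean_ranks ** 2) - k * (k + 1) ** 2 / 4.0)
```

A large statistic means the algorithms' mean ranks diverge strongly from the equal-rank hypothesis, which is exactly the situation the very small P values in Table 16 reflect.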
Ablation studies
To understand how the DRS and the multi-strategy co-evolution model added to GWO in this paper affect performance, we analyze the results of adding each strategy on three datasets.
The layer-by-layer performance analysis is shown in Table 17, where MCM stands for the multi-strategy co-evolution model, and the best values are in bold. After adding DRS, the performance of GWO exhibits a slight enhancement, although it remains suboptimal. After incorporating the multi-strategy co-evolution model, the performance improves to a discernible extent, indicating that improving population diversity has an important effect on GWO. The optimal outcomes are attained when the original GWO algorithm employs both DRS and the multi-strategy co-evolution model. For instance, comparing the proposed method with the original algorithm at percent = 20, HRO-GWO yields the most promising outcomes on the CLL_SUB_111 dataset using KNN as the classifier. Specifically, GWO achieves a fitness value of 4.736e−02 after incorporating DRS, an improvement of 1.234e−02 over the original algorithm. Additionally, the multi-strategy co-evolution model improves the fitness value of GWO to 4.096e−02, verifying its superiority. The HRO-GWO algorithm combines both methods and achieves a notable fitness value of 3.100e−02. In general, the optimization mechanism proposed in the paper plays a key role in improving the performance of GWO.
Table 18 demonstrates a detailed comparison of the three update strategies in the multi-strategy co-evolution model added separately. And the best values are in bold. In most cases, the application of each strategy contributes to reinforcing the GWO. However, when only two of them are combined, they may show varying enhancement effects across diverse datasets, classifiers, and dimensions. The most substantial impact is realized when these strategies collaborate synergistically.
To assess the performance impact of integrating the chi-square technique into the HRO-GWO-based framework, we examine the mean fitness value and running time across three datasets, as shown in Table 19, with the best values in bold. The experimental findings demonstrate that substantial time savings are realized alongside accuracy enhancements after incorporating the chi-square technique into the FS framework. Notably, in the experiments with KNN, the accuracy on the LFPF\(\_\)1 dataset improves by about 15.4% while saving 8611 seconds.
Conclusion and future work
The paper proposes a novel FS framework based on a modified GWO to solve small-sample high-dimensional FS tasks. The HRO-GWO algorithm incorporates four innovative strategies, including DRS and three search strategies, to improve its performance. DRS adjusts the essential parameters of the GWO algorithm during the optimization process to enhance its adaptability. Subsequently, a multi-strategy co-evolution model is derived from the inspiration of HRO; poor population diversity is effectively mitigated by the model combining NSS, DCS, and SS. Moreover, to enhance the classification performance while conserving time, a hybrid filter-wrapper framework combining chi-square and HRO-GWO is designed to efficiently select relevant and informative feature subsets. Experimental results show that both the capability of HRO-GWO and the performance of the proposed FS framework based on it outperform the competing methods employed in the study.
In recent years, metaheuristic algorithms have become increasingly popular for solving high-dimensional problems in big data, and the rapidly expanding volume of data calls for more grouping strategies. The work provides independent insights into GWO-based FS methods, which can serve as a guide for future development. One future goal for HRO-GWO is to reduce time costs while preserving diversity: multi-strategy collaborative methods inevitably increase the running time of the algorithm while improving population diversity. Therefore, balancing classification accuracy and time cost will be a key difficulty in future work.
Data availability
The datasets are available at https://github.com/taiyang479/HRO-GWO.git. For further needs contact the corresponding author.
References
Atashgahi, Z. et al. Supervised feature selection with neuron evolution in sparse neural networks. arXiv preprint arXiv:2303.07200, https://doi.org/10.48550/arXiv.2303.07200 (2023).
Tijjani, S., Wahab, M. N. A. & Noor, M. H. M. An enhanced particle swarm optimization with position update for optimal feature selection. Expert Syst. Appl. 446, 123337. https://doi.org/10.1016/j.eswa.2024.123337 (2024).
Zhou, L., Pan, S., Wang, J. & Vasilakos, A. V. Machine learning on big data: Opportunities and challenges. Neurocomputing 237, 350–361. https://doi.org/10.1016/j.neucom.2017.01.026 (2017).
Lin, Q., Chen, X., Chen, C. & Garibaldi, J. M. Boundary-wise loss for medical image segmentation based on fuzzy rough sets. Inf. Sci. 661, 120183. https://doi.org/10.1016/j.ins.2024.120183 (2024).
Telikani, A., Gandomi, A. H. & Shahbahrami, A. A survey of evolutionary computation for association rule mining. Inf. Sci. 524, 318–352. https://doi.org/10.1016/j.ins.2020.02.073 (2020).
Azzam, S. M., Emam, O. & Abolaban, A. S. An improved differential evolution with sailfish optimizer (desfo) for handling feature selection problem. Sci. Rep. 14, 13517. https://doi.org/10.1038/s41598-024-63328-w (2024).
Zhang, A. et al. Hyperspectral band selection using crossover-based gravitational search algorithm. IET Image Proc. 13, 280–286. https://doi.org/10.1049/iet-ipr.2018.5362 (2019).
Ang, J. C., Mirzal, A., Haron, H. & Hamed, H. N. A. Supervised, unsupervised, and semi-supervised feature selection: A review on gene selection. IEEE/ACM Trans. Comput. Biol. Bioinf. 13, 971–989. https://doi.org/10.1109/TCBB.2015.2478454 (2015).
Shi, J., Zhang, X., Liu, X., Lei, Y. & Jeon, G. Multicriteria semi-supervised hyperspectral band selection based on evolutionary multitask optimization. Knowl.-Based Syst. 240, 107934. https://doi.org/10.1016/j.knosys.2021.107934 (2022).
Bhadra, T. & Bandyopadhyay, S. Supervised feature selection using integration of densest subgraph finding with floating forward-backward search. Inf. Sci. 566, 1–18. https://doi.org/10.1016/j.ins.2021.02.034 (2021).
Turky, A., Sabar, N. R., Dunstall, S. & Song, A. Hyper-heuristic local search for combinatorial optimisation problems. Knowl.-Based Syst. 205, 106264. https://doi.org/10.1016/j.knosys.2020.106264 (2020).
Nssibi, M., Manita, G. & Korbaa, O. Advances in nature-inspired metaheuristic optimization for feature selection problem: A comprehensive survey. Comput. Sci. Rev. 49, 100559. https://doi.org/10.1016/j.cosrev.2023.100559 (2023).
Zhou, J. & Hua, Z. A correlation guided genetic algorithm and its application to feature selection. Appl. Soft Comput. 123, 108964. https://doi.org/10.1016/j.asoc.2022.108964 (2022).
Fang, Y., Yao, Y., Lin, X., Wang, J. & Zhai, H. A feature selection based on genetic algorithm for intrusion detection of industrial control systems. Comput. Secur. 139, 103675. https://doi.org/10.1016/j.cose.2023.103675 (2024).
Nadimi-Shahraki, M. H., Taghian, S., Mirjalili, S. & Abualigah, L. Binary aquila optimizer for selecting effective features from medical data: A covid-19 case study. Mathematics 10, 1929. https://doi.org/10.3390/math10111929 (2022).
Wan, Y., Wang, M., Ye, Z. & Lai, X. A feature selection method based on modified binary coded ant colony optimization algorithm. Appl. Soft Comput. 49, 248–258. https://doi.org/10.1016/j.asoc.2016.08.011 (2016).
Kale, G. A. & Yüzgeç, U. Advanced strategies on update mechanism of sine cosine optimization algorithm for feature selection in classification problems. Eng. Appl. Artif. Intell. 107, 104506. https://doi.org/10.1016/j.engappai.2021.104506 (2022).
Abed-Alguni, B. H., Alawad, N. A., Al-Betar, M. A. & Paul, D. Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection. Appl. Intell. 53, 13224–13260. https://doi.org/10.1007/s10489-022-04201-z (2023).
Riyahi, M., Rafsanjani, M. K., Gupta, B. B. & Alhalabi, W. Multiobjective whale optimization algorithm-based feature selection for intelligent systems. Int. J. Intell. Syst. 37, 9037–9054. https://doi.org/10.1002/int.22979 (2022).
Amoozegar, M. & Minaei-Bidgoli, B. Optimizing multi-objective pso based feature selection method using a feature elitism mechanism. Expert Syst. Appl. 113, 499–514. https://doi.org/10.1016/j.eswa.2018.07.013 (2018).
Gao, J. et al. Information gain ratio-based subfeature grouping empowers particle swarm optimization for feature selection. Knowl.-Based Syst. 286, 111380. https://doi.org/10.1016/j.knosys.2024.111380 (2024).
Mirjalili, S., Mirjalili, S. M. & Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007 (2014).
Nadimi-Shahraki, M. H., Taghian, S. & Mirjalili, S. An improved grey wolf optimizer for solving engineering problems. Expert Syst. Appl. 166, 113917. https://doi.org/10.1016/j.eswa.2020.113917 (2021).
Yang, G. et al. A modified gray wolf optimizer-based negative selection algorithm for network anomaly detection. Int. J. Intell. Syst. 2023, https://doi.org/10.1155/2023/8980876 (2023).
Wang, M., Liu, W., Chen, M., Huang, X. & Han, W. A band selection approach based on a modified gray wolf optimizer and weight updating of bands for hyperspectral image. Appl. Soft Comput. 112, 107805. https://doi.org/10.1016/j.asoc.2021.107805 (2021).
Cheng, X., Li, J., Zheng, C., Zhang, J. & Zhao, M. An improved PSO-GWO algorithm with chaos and adaptive inertial weight for robot path planning. Front. Neurorobot. 15, 770361. https://doi.org/10.3389/fnbot.2021.770361 (2021).
Liu, J., Wei, X. & Huang, H. An improved grey wolf optimization algorithm and its application in path planning. IEEE Access 9, 121944–121956. https://doi.org/10.1109/ACCESS.2021.3108973 (2021).
Pan, H., Chen, S. & Xiong, H. A high-dimensional feature selection method based on modified gray wolf optimization. Appl. Soft Comput. 135, 110031. https://doi.org/10.1016/j.asoc.2023.110031 (2023).
Abdel-Basset, M., El-Shahat, D., El-Henawy, I., De Albuquerque, V. H. C. & Mirjalili, S. A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection. Expert Syst. Appl. 139, 112824. https://doi.org/10.1016/j.eswa.2019.112824 (2020).
Adhikary, J. & Acharyya, S. Randomized balanced grey wolf optimizer (RBGWO) for solving real life optimization problems. Appl. Soft Comput. 117, 108429. https://doi.org/10.1016/j.asoc.2022.108429 (2022).
Premkumar, M. et al. Augmented weighted k-means grey wolf optimizer: An enhanced metaheuristic algorithm for data clustering problems. Sci. Rep. 14, 5434. https://doi.org/10.1038/s41598-024-55619-z (2024).
Bilal, A. et al. Breast cancer diagnosis using support vector machine optimized by improved quantum inspired grey wolf optimization. Sci. Rep. 14, 10714. https://doi.org/10.1038/s41598-024-61322-w (2024).
Mafarja, M. et al. An efficient high-dimensional feature selection approach driven by enhanced multi-strategy grey wolf optimizer for biological data classification. Neural Comput. Appl. 35, 1749–1775. https://doi.org/10.1007/s00521-022-07836-8 (2023).
Pirgazi, J., Alimoradi, M., Esmaeili Abharian, T. & Olyaee, M. H. An efficient hybrid filter-wrapper metaheuristic-based gene selection method for high dimensional datasets. Sci. Rep. 9, 18580. https://doi.org/10.1038/s41598-019-54987-1 (2019).
Ye, Z., Ma, L. & Chen, H. A hybrid rice optimization algorithm. In 2016 11th International Conference on Computer Science & Education (ICCSE), 169–174 (IEEE, 2016). https://doi.org/10.1109/ICCSE.2016.7581575.
Shu, Z. et al. A modified hybrid rice optimization algorithm for solving 0–1 knapsack problem. Appl. Intell. 52, 5751–5769. https://doi.org/10.1007/s10489-021-02717-4 (2022).
Ye, Z. et al. A band selection approach for hyperspectral image based on a modified hybrid rice optimization algorithm. Symmetry 14, 1293. https://doi.org/10.3390/sym14071293 (2022).
Mirza, O. M. et al. Computer aided diagnosis for gastrointestinal cancer classification using hybrid rice optimization with deep learning. IEEE Access https://doi.org/10.1109/ACCESS.2023.3297441 (2023).
Ye, Z., Luo, J., Zhou, W., Wang, M. & He, Q. An ensemble framework with improved hybrid breeding optimization-based feature selection for intrusion detection. Future Gener. Comput. Syst. https://doi.org/10.1016/j.future.2023.09.035 (2023).
Ye, A. Z. et al. High-dimensional feature selection based on improved binary ant colony optimization combined with hybrid rice optimization algorithm. Int. J. Intell. Syst. https://doi.org/10.1155/2023/1444938 (2023).
Kennedy, J. & Eberhart, R. Particle swarm optimization. In Proceedings of ICNN’95-International Conference on Neural Networks, vol. 4, 1942–1948 (IEEE, 1995). https://doi.org/10.1109/ICNN.1995.488968.
Hafez, A. I., Zawbaa, H. M., Emary, E. & Hassanien, A. E. Sine cosine optimization algorithm for feature selection. In 2016 International Symposium on Innovations in Intelligent Systems and Applications (INISTA), 1–5 (IEEE, 2016). https://doi.org/10.1109/INISTA.2016.7571853.
Mirjalili, S. & Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 95, 51–67. https://doi.org/10.1016/j.advengsoft.2016.01.008 (2016).
Zhou, G., Li, K., Wan, G. & Ji, H. Feature selection algorithm based on multi strategy grey wolf optimizer. In International Conference on Intelligent Information Processing, 35–45, (Springer, 2020). https://doi.org/10.1007/978-3-030-46931-3_4.
Amiri, M. H., Mehrabi Hashjin, N., Montazeri, M., Mirjalili, S. & Khodadadi, N. Hippopotamus optimization algorithm: A novel nature-inspired optimization algorithm. Sci. Rep. 14, 5032. https://doi.org/10.1038/s41598-024-54910-3 (2024).
Ghasemi, M. et al. Optimization based on the smart behavior of plants with its engineering applications: Ivy algorithm. Knowl.-Based Syst. 295, 111850. https://doi.org/10.1016/j.knosys.2024.111850 (2024).
Ganjei, M. A. & Boostani, R. A hybrid feature selection scheme for high-dimensional data. Eng. Appl. Artif. Intell. 113, 104894. https://doi.org/10.1016/j.engappai.2022.104894 (2022).
Moslemi, A. A tutorial-based survey on feature selection: Recent advancements on feature selection. Eng. Appl. Artif. Intell. 126, 107136. https://doi.org/10.1016/j.engappai.2023.107136 (2023).
Ali, W. & Saeed, F. Hybrid filter and genetic algorithm-based feature selection for improving cancer classification in high-dimensional microarray data. Processes 11, 562. https://doi.org/10.3390/pr11020562 (2023).
Ye, Z. et al. Elite GA-based feature selection of LSTM for earthquake prediction. J. Supercomput. https://doi.org/10.1007/s11227-024-06218-2 (2024).
Zamani, H. & Nadimi-Shahraki, M. H. An evolutionary crow search algorithm equipped with interactive memory mechanism to optimize artificial neural network for disease diagnosis. Biomed. Signal Process. Control 90, 105879. https://doi.org/10.1016/j.bspc.2023.105879 (2024).
Salgotra, R. & Gandomi, A. H. A novel multi-hybrid differential evolution algorithm for optimization of frame structures. Sci. Rep. 14, 4877. https://doi.org/10.1038/s41598-024-54384-3 (2024).
Sun, L. et al. Feature selection using binary monarch butterfly optimization. Appl. Intell. 53, 706–727. https://doi.org/10.1007/s10489-022-03554-9 (2023).
Xie, W., Wang, L., Yu, K., Shi, T. & Li, W. Improved multi-layer binary firefly algorithm for optimizing feature selection and classification of microarray data. Biomed. Signal Process. Control 79, 104080. https://doi.org/10.1016/j.bspc.2022.104080 (2023).
Jiang, L., Greenwood, C. M., Yao, W. & Li, L. Bayesian hyper-lasso classification for feature selection with application to endometrial cancer RNA-seq data. Sci. Rep. 10, 9747. https://doi.org/10.1038/s41598-020-66466-z (2020).
Moslemi, A. et al. Classifying future healthcare utilization in COPD using quantitative CT lung imaging and two-step feature selection via sparse subspace learning with the cancold study. Acad. Radiol. https://doi.org/10.1016/j.acra.2024.03.030 (2024).
Wang, Y., Ran, S. & Wang, G.-G. Role-oriented binary grey wolf optimizer using foraging-following and Lévy flight for feature selection. Appl. Math. Model. 126, 310–326. https://doi.org/10.1016/j.apm.2023.08.043 (2024).
Zhang, L., Shan, L. & Wang, J. Optimal feature selection using distance-based discrete firefly algorithm with mutual information criterion. Neural Comput. Appl. 28, 2795–2808. https://doi.org/10.1007/s00521-016-2204-0 (2017).
Salgotra, R., Singh, U. & Singh, G. Improving the adaptive properties of lshade algorithm for global optimization. In 2019 International Conference on Automation, Computational and Technology Management (ICACTM), 400–407 (IEEE, 2019).
Yan, C. et al. A novel hybrid filter/wrapper feature selection approach based on improved fruit fly optimization algorithm and chi-square test for high dimensional microarray data. Curr. Bioinform. 16, 63–79. https://doi.org/10.2174/1574893615666200324125535 (2021).
Acknowledgements
The authors wish to thank the National Natural Science Foundation of China (NSFC, http://www.nsfc.gov.cn/) for its support through Grant Numbers 62376089, 62202147, 62302154, and 42201464, and the Key Research and Development Program of Hubei Province for its support through Grant Number 2023BEB024.
Author information
Authors and Affiliations
Contributions
Conceptualization: Ruoxuan Huang; Methodology: Ruoxuan Huang, Zhiwei Ye, Wen Zhou; Formal analysis and investigation: Zhiwei Ye, Ruoxuan Huang, Wen Zhou; Writing - original draft preparation: Ruoxuan Huang; Writing - review and editing: Zhiwei Ye, Ruoxuan Huang, Wen Zhou, Mingwei Wang, Ting Cai, Qiyi He; Funding acquisition: Zhiwei Ye, Wen Zhou, Ting Cai, Qiyi He, Peng Zhang, Yuquan Zhang; Resources: Zhiwei Ye, Wen Zhou, Mingwei Wang, Ting Cai, Qiyi He; Supervision: Peng Zhang, Yuquan Zhang.
Corresponding author
Ethics declarations
Conflict of interest
All authors declare that they have no conflicts of interest.
Reprints and permissions information
is available at www.nature.com/reprints.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ye, Z., Huang, R., Zhou, W. et al. Hybrid rice optimization algorithm inspired grey wolf optimizer for high-dimensional feature selection. Sci Rep 14, 30741 (2024). https://doi.org/10.1038/s41598-024-80648-z
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41598-024-80648-z
This article is cited by
- A novel hybrid feature selection method combining binary grey wolf optimization and cuckoo search. Scientific Reports (2025)
- An improved Grey Wolf Optimizer based on mutation operator, evolutionary population dynamics, and nonlinear population size reduction strategy. Scientific Reports (2025)
- A dual-enhanced long short-term memory earthquake prediction method based on improved and hybrid rice-inspired gray wolf optimizers. The Journal of Supercomputing (2025)