Abstract
The Crayfish Optimization Algorithm (COA) suffers from loss of population diversity, insufficient exploratory capability, a tendency to become trapped in local optima, and imprecise search. To address these issues, this work introduces a hybrid strategy enhanced crayfish optimization algorithm (MSCOA). First, a chaotic inverse exploration initialization method is used to generate a highly diverse initial population, significantly enhancing global exploration. Second, an adaptive t-distributed feeding strategy models the connection between feeding behavior and temperature, increasing population diversity and improving local search effectiveness. Finally, an adaptive ternary optimization mechanism is introduced in the exploration phase: a curve growth acceleration factor collaboratively guides the global and individual optimal information, a hybrid adaptive cosine exponential weight dynamically adjusts the search intensity, and an inverse worst-individual mutation reinforcement approach improves population evolution efficiency. On the hybrid CEC2005 and CEC2019 test sets, MSCOA shows improved convergence accuracy compared with the traditional COA, and the Wilcoxon test (p < 0.05) confirms its superiority over the comparison algorithms. MSCOA outperforms the compared algorithms in robustness, convergence speed, and solution accuracy, although there is still room for further improvement. When combined with the Extreme Learning Machine (ELM) and applied to the Wisconsin breast cancer dataset, the MSCOA-ELM model achieved 100% accuracy and F1 score, a 28.9% improvement over the baseline ELM, demonstrating the algorithm’s efficiency and generalization ability on practical optimization problems.
Introduction
Optimization is widely regarded as the most effective way to determine the best solution to real-world problems in complex domains. All scientific fields face the constant challenge of optimization, and hundreds of metaheuristic algorithms (MAs) have been devised to handle the growing complexity of optimization problems. MAs form a broad class of algorithms used to tackle challenging optimization problems. Numerous domains, including engineering design1, machine learning2, medicine3, community detection4, atmospheric sciences5, and others, rely heavily on these metaheuristic techniques. MAs are often categorized according to their source of inspiration: evolutionary algorithms6, swarm intelligence (SI)7, physics-based algorithms8, and human behavior-based algorithms9. Among them, SI is particularly popular in scientific research and practical applications due to the simplicity and efficiency of its mathematical models.
SI algorithms are stochastic optimization methods that tackle complicated optimization problems by mimicking the collective behavior of natural groups of organisms; examples include the Spider Wasp Optimizer10, Genghis Khan Shark Optimizer11, Puma Optimizer12, Dung Beetle Optimizer13, Gazelle Optimization Algorithm14, Grey Wolf Optimizer15, and Frilled Lizard Optimization16. These algorithms have demonstrated powerful problem-solving capabilities in computer science17, engineering18, mathematics19, environmental sciences20, biomedical science21, and many other fields.
In recent years, the integration of metaheuristic algorithms and machine learning has achieved remarkable progress in various fields. Researchers have explored how metaheuristic optimization algorithms can enhance machine learning models. Jovanovic et al.22 proposed a hybrid approach combining machine learning and swarm intelligence optimization to tackle the problem of credit card fraud detection. Han et al.23 utilized a hybrid method based on Convolutional Neural Networks (CNN) and Extreme Learning Machine (ELM) for the optimal and efficient modeling of Proton-exchange Membrane Fuel Cells (PEMFC), further optimizing the model using an improved Honey Badger Algorithm (IHBA) to achieve optimal results. Ewees et al.24 employed the Honey Badger Optimization (HBO) algorithm to optimize Long Short-Term Memory (LSTM) networks, boosting wind power forecasting performance. Additionally, Ma et al.25 combined the Multi-Verse Optimization (MVO) algorithm with Support Vector Regression (SVR) for geohazard modeling, achieving great success. Jovanovic et al.26 tuned the multi-headed LSTM structure using a modified Particle Swarm Optimization (PSO) algorithm to enhance fuel price forecasting accuracy. These studies highlight that the fusion of metaheuristic optimization and machine learning can optimize model structures and improve prediction and classification performance. In the future, this research domain holds great potential, particularly in intelligent scheduling optimization, automated hyperparameter tuning, and deep learning-driven adaptive search strategies, which could further strengthen the synergy between optimization algorithms and machine learning.
The Crayfish Optimization Algorithm (COA)27, a new swarm intelligence algorithm, was proposed by Jia et al. in 2023, inspired by crayfish behaviors such as burrow competition, foraging, and heat avoidance. COA models these behaviors through a two-phase strategy: the exploration phase mimics crayfish searching for habitats to enhance global search ability, while the exploitation phase mimics burrow competition and foraging to achieve local optimization. The algorithm adjusts dynamically to temperature changes: crayfish search for burrows to avoid the heat when the temperature exceeds 30 °C and forage when it falls below 30 °C.
Motivations behind the present work
Although COA possesses numerous advantages, it also has several shortcomings:
- Decline in population diversity.
- Tendency to fall into local optima.
- Insufficient exploration capability.
- Lack of adaptability due to static parameters.
Contribution of this study
To efficiently address these limitations of COA, this study proposes a hybrid strategy enhanced crayfish optimization algorithm (MSCOA). To evaluate its performance, MSCOA is combined with Extreme Learning Machine (ELM) and applied to breast cancer prediction. The key contributions of this study are as follows:
- The chaotic inverse exploration initialization strategy initializes the population by combining Logistic-Tent chaotic mapping with quasi-reverse learning to generate more diverse and uniformly distributed solutions, improving the algorithm’s global search effectiveness.
- The adaptive t-distributed feeding strategy describes the relationship between feeding amount and temperature, which effectively strengthens local search and improves adaptability to the simulated real-world environment.
- A curve growth acceleration factor \(C_1\) is proposed to optimize burrow positioning, ensuring that individual and global optimal solutions are integrated and that population information is used more fully.
- A hybrid adaptive cosine exponential weight \(\omega\) is introduced into the crayfish position update formula, allowing the algorithm to dynamically adjust search intensity and balance global exploration with local convergence.
- The inverse worst-individual mutation strengthening strategy applies inverse perturbation to poor solutions, guiding the population out of local optima.
- A novel adaptive ternary optimization mechanism is proposed by integrating \(C_1\), \(\omega\), and the inverse worst-individual mutation strengthening strategy into the position update of the summer resort stage.
- The MSCOA-ELM model is developed and applied to breast cancer prediction, demonstrating the algorithm’s potential for practical applications.
To clearly highlight the main contributions of MSCOA, they are summarized in the following numbered list:
1. Introduction of a chaotic inverse exploration initialization strategy.
2. Implementation of an adaptive t-distributed feeding strategy.
3. Proposal of a curve growth acceleration factor for optimized burrow positioning.
4. Integration of hybrid adaptive cosine exponential weights in position updates.
5. Development of an inverse worst-individual mutation strengthening strategy.
6. Creation of a novel adaptive ternary optimization mechanism.
7. Application of MSCOA-ELM to breast cancer prediction, demonstrating real-world utility.
This paper is organized as follows: Section 1 (this Introduction) provides a succinct synopsis of the topics and issues discussed throughout the paper; Section 2 reviews previous research on COA improvements and its practical applications; Section 3 provides a comprehensive summary of COA; Section 4 proposes MSCOA, detailing the algorithmic flow and the specifics of the algorithm; Section 5 presents the experimental results with analysis and discussion; Section 6 integrates MSCOA with ELM (MSCOA-ELM) for breast cancer prediction; and Section 7 concludes the paper.
Literature review
Owing to its speed, flexibility, and adaptability, COA has been applied across a wide range of fields. For instance, Hussam et al.28 developed a hybrid COASaDE optimizer for handling difficult optimization and engineering design problems. To address feature selection (FS) problems, Shaymaa et al.29 presented a successful IBCOA approach. Sait et al.30 explored the application of COA to engineering design optimization with the assistance of an artificial neural network, and Shikoun et al.31 offered a novel BinCOA for feature selection.
COA has also been used for PV parameter extraction32, cybersecurity33, PV parameter estimation34, kidney tumor segmentation and classification35, and friction behavior modeling in complex machinery36.
Moreover, Yakubu et al.37 used COA to compute an optimal resource allocation across the fog and cloud layers that reduces system delay and execution time. Jia et al.38 used COA to deftly find the crucial hyperparameters of their model. Jiang et al.39 showcased a BiLSTM approach for time-series forecasting optimized by an enhanced crayfish optimization algorithm (ICOA). Huang et al.40 suggested an innovative fault diagnosis methodology for distribution transformers in which COA is used for parameter optimization. Almutairi et al.41 leveraged a hierarchical optimization strategy that integrates particle swarm optimization with crayfish optimization for more effective solutions.
Nevertheless, COA’s performance in complex optimization tasks is hindered by challenges such as local optima, diminished population diversity, and insufficient exploration capacity during the optimization process.
To mitigate these issues, Zhong et al.42 proposed HRCOA, which combines ROA’s exploitation operators with COA’s summer resort operators to streamline the algorithm; however, this method struggles with inefficiencies on high-dimensional problems. To improve performance further, Zhang et al.43 presented ECOA, which introduces four enhancement measures, yet its convergence accuracy remains inadequate for complex multimodal functions. Jia et al.44 presented MCOA, which updates the environment based on the crayfish’s survival habits; MCOA uses a ghost-opposition-based learning strategy together with a water quality factor to guide the crayfish toward better environments, but it still has limitations in balancing global exploration and local exploitation. Finally, Wang et al.45 suggested IMCOA to address the disconnect between the global exploration and local exploitation phases of COA, particularly during the summer resort and competition phases, but it does not completely eliminate premature convergence.
Crayfish optimization algorithm
Crayfish (scientific name: Procambarus clarkii) is a freshwater crustacean46 indigenous to North America that has been introduced to several regions of the world owing to its adaptability. Temperature significantly impacts its survival47, influencing behaviors such as burrowing to avoid heat or foraging when conditions are favorable.
As omnivores, crayfish eat a wide range of foods, including grains, insects, and plant material48. When feeding, crayfish usually claw at large food items, tear them apart, and pass the pieces to the second and third walking legs for grasping and nibbling; small food items are grabbed and eaten directly with the second and third walking legs. Crayfish typically use their pincers or rapid concealment as a defensive tactic to keep other crayfish from stealing their food49. As shown in Fig. 1, the foraging, heat-avoidance, and competition behaviors of freshwater crayfish are the inspiration for COA and are mapped to the algorithm’s exploration and exploitation phases.
Temperature and feeding
Changes in environmental temperature cause crayfish to exhibit different behavioral stages. Temperature is defined by Eq. (1). Within a suitable temperature range (20 °C to 35 °C), crayfish engage in foraging, and their feeding amount is regulated by temperature, following an approximately normal distribution trend, as given by the following equations:
where \(\mu\) denotes the temperature most suitable for crayfish foraging. By adjusting \(C_1\) and \(\sigma\), the quantity of food that crayfish eat at different temperatures can be controlled.
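Since Eqs. (1) and (2) are not reproduced above, the following minimal sketch assumes the forms given in the original COA paper (a random ambient temperature in [20, 35] °C and a Gaussian-shaped intake curve with μ = 25, σ = 3, and C1 = 0.2); these constants are assumptions drawn from that reference rather than from this text.

```python
import numpy as np

def crayfish_temperature(rng):
    # Random ambient temperature in [20, 35] C (assumed form of Eq. (1)).
    return rng.random() * 15 + 20

def feeding_intake(temp, C1=0.2, mu=25.0, sigma=3.0):
    # Gaussian-shaped intake p (assumed form of Eq. (2)): largest near the ideal
    # temperature mu, with C1 and sigma controlling the amount eaten elsewhere.
    return C1 * (1.0 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-((temp - mu) ** 2) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
temp = crayfish_temperature(rng)
print(temp, feeding_intake(temp))
```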
Summer resort stage
When external temperatures exceed 30 °C, crayfish instinctively retreat into their burrows to escape the heat. The burrow position is defined by Eq. (3).
where \(X_L\) denotes the best position in the current population and \(X_G\) denotes the best position attained over the iterative process. Competition between crayfish for burrows is random. Equation (3) states that when rand < 0.5, the crayfish retreats directly into the burrow to escape the heat, because no other crayfish is competing for it.
where t is the current iteration number, \(X_{i,j}^{t}\) denotes the j-th dimension of the i-th crayfish’s position at iteration t, and \(C_2\) is a coefficient that decreases linearly from 2 to 0, calculated as follows.
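As a concrete illustration of this stage, the sketch below assumes the burrow definition \(X_{shade} = (X_L + X_G)/2\) and the direct-retreat update used in the original COA; these exact forms are assumptions taken from that reference, since Eqs. (3)–(5) are not reproduced here.

```python
import numpy as np

def summer_resort_step(X_i, X_L, X_G, t, T, rng):
    """Direct retreat into the burrow when temp > 30 C and rand < 0.5 (forms assumed from the original COA)."""
    X_shade = (X_L + X_G) / 2.0           # burrow position (assumed form of Eq. (3))
    C2 = 2.0 * (1.0 - t / T)              # decreases linearly from 2 to 0, as stated in the text
    return X_i + C2 * rng.random() * (X_shade - X_i)

rng = np.random.default_rng(1)
X_i, X_L, X_G = rng.random(5), rng.random(5), rng.random(5)
print(summer_resort_step(X_i, X_L, X_G, t=100, T=1000, rng=rng))
```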
Competition stage
If rand > 0.5 and the ambient temperature exceeds 30 °C, other crayfish are also interested in the burrow, and the crayfish compete for it. Each crayfish \(X_i\) modifies its position based on the position of another crayfish \(X_z\), according to the following formula.
where B is the crayfish population size and z denotes another randomly selected crayfish, obtained from the following equation.
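For illustration, the sketch below follows the competition update reported for the original COA (move relative to a randomly chosen rival and the burrow); since Eqs. (6)–(7) are not reproduced above, these forms are assumptions taken from that reference.

```python
import numpy as np

def competition_step(X, i, X_shade, rng):
    """Burrow competition when temp > 30 C and rand >= 0.5 (form assumed from the original COA)."""
    B = X.shape[0]                                # population size
    z = int(np.round(rng.random() * (B - 1)))     # index of a randomly chosen rival crayfish
    return X[i] - X[z] + X_shade                  # move relative to the rival and the burrow

rng = np.random.default_rng(2)
X = rng.random((30, 5))
X_shade = (X[0] + X[1]) / 2.0                     # placeholder burrow for the demo
print(competition_step(X, i=3, X_shade=X_shade, rng=rng))
```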
Foraging stage
Crayfish approach the food source when the outside temperature is at or below 30 °C. Upon locating food, they assess its size: if the food is too large, they tear it apart with their claws and feed alternately with their second and third legs; if the food is of moderate size, they approach and consume it directly. The food position is denoted \(X_{food}\), defined as \(X_{food} = X_G\). The food size Q is described by Eq. (8).
where \(C_3\) denotes the food factor, a constant representing the largest food size, with a value of 3; \(fitness_i\) denotes the fitness value of the i-th crayfish, and \(fitness_{food}\) denotes the fitness value of the food.
When the food size \(Q > (C_3+1)/2\), the food is considered too large, and the crayfish begins tearing it apart with its first claw, according to the following equation:
As the food is shredded and becomes smaller, it is alternately picked up and brought to the mouth using the second and third walking legs. Equation (10) combines sine and cosine functions to simulate this alternating motion of the second and third claws.
When the food size \(Q \le (C_3+1)/2\), the crayfish simply travels toward the food position and feeds, as Eq. (11) illustrates.
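A compact sketch of the foraging stage is given below. The food-size, shredding, and feeding formulas follow the forms reported for the original COA, which are assumptions here because Eqs. (8)–(11) are not reproduced above.

```python
import numpy as np

def foraging_step(X_i, X_food, fit_i, fit_food, p, rng, C3=3.0):
    """Foraging when temp <= 30 C (forms assumed from the original COA description)."""
    Q = C3 * rng.random() * (fit_i / fit_food)                     # food size (assumed Eq. (8))
    if Q > (C3 + 1) / 2:                                           # food too large: shred, then feed alternately
        X_food = np.exp(-1.0 / Q) * X_food                         # shredding (assumed Eq. (9))
        return X_i + X_food * p * (np.cos(2 * np.pi * rng.random())
                                   - np.sin(2 * np.pi * rng.random()))  # alternating feed (assumed Eq. (10))
    # moderate food: move straight to it and feed (assumed Eq. (11))
    return (X_i - X_food) * p + p * rng.random() * X_i

rng = np.random.default_rng(3)
print(foraging_step(rng.random(5), rng.random(5), fit_i=2.0, fit_food=1.0, p=0.05, rng=rng))
```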
Hybrid strategy enhanced crayfish optimization algorithm
To address the problems of COA, this paper proposes MSCOA. To remedy the insufficient population diversity of COA, where traditional random initialization leads to an uneven distribution of solutions and low global search efficiency, this study proposes a chaotic reverse initialization strategy. To overcome the fixed step size of the COA feeding phase, which limits local search capability and makes static parameters poorly suited to dynamic optimization environments, this paper introduces the t-distribution feeding strategy. To tackle the excessive reliance on the current best solution, which easily leads to local optima and premature convergence, we propose the inverse elite mutation reinforcement strategy. Lastly, to compensate for the lack of a dynamic weight mechanism in COA, which hinders flexible adjustment of the search intensity, this paper introduces the curve growth guidance factor and the hybrid adaptive cosine exponential weight.
Chaotic reverse initialization strategy
In intelligent optimization algorithms, the quality of the original population has a non-negligible effect on the algorithm’s performance, and can directly impact both the convergence rate and global search capability50. The random initialization adopted by the basic COA algorithm may lead to the distribution of the initial population being too scattered or concentrated, which lacks effective guidance of the search space. An inadequately distributed population may prematurely converge to local optima, reducing the overall search effectiveness. To address this, a chaotic inverse initialization strategy is adopted to optimize the population initialization process. This is achieved by incorporating Logistic-Tent chaotic mapping along with quasi-inverse learning, enhancing the algorithm’s global search efficiency.
Logistic-Tent chaotic mapping
Good randomness, traversability, and non-repetition characterize chaotic motion. Chaotic mapping has been a popular technique in optimization algorithms in recent years. The algorithm’s performance may be further enhanced by incorporating chaotic behavior, which can further increase the population variety and global search efficiency51.
The Logistic map is a single-peaked (unimodal) map that inherits the unpredictability and sensitivity to initial values characteristic of chaotic systems. It is expressed mathematically as:
where \(X_n \in (0,1)\) is the result of the n-th iteration and \(a \in (0,4)\) is a control parameter.
The Tent chaotic system, another one-dimensional mapping, is known for its uniform probability distribution and strong autocorrelation. Its mathematical form is as follows:
where \(X_n \in (0,1)\) is the iteration result and \(b \in (0,4)\) is a control parameter; the system is chaotic when \(b \ne 0.5\).
By combining these two chaotic systems, Zhu et al.52 produced the Logistic-Tent composite chaotic system. The Logistic-Tent chaotic map combines the benefits of the Tent and Logistic maps, which promotes the stability and diversity of the chaotic behavior, and it is capable of generating highly complex and unpredictable sequences with good traversal properties. The Logistic-Tent map is expressed mathematically as:
where \(X_n \in (0,1)\) is the iteration result and \(k \in (0,4)\) is a control parameter.
Figure 2 shows the histograms of these three chaotic maps after 500 iterations; the Logistic-Tent map has a more uniform distribution over [0,1] than the other two typical maps. This means that using the Logistic-Tent map for population initialization covers the search space more thoroughly and raises the likelihood of identifying the global optimum. Furthermore, the Lyapunov exponent is an important measure of chaotic performance; it reaches 1.2937 for Logistic-Tent cascade chaotic sequences, compared with 0.6926 for Tent chaotic sequences and 0.6930 for Logistic chaotic sequences53. The Logistic-Tent chaotic map therefore has clearly better chaotic properties.
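Equation (15) is not reproduced above; the sketch below uses a commonly cited form of the Logistic-Tent composite map, which should be treated as an assumption rather than the paper’s exact expression.

```python
import numpy as np

def logistic_tent(x, k=3.99):
    """One step of a commonly used Logistic-Tent composite map (assumed form), x in (0,1), k in (0,4)."""
    if x < 0.5:
        return (k * x * (1 - x) + (4 - k) * x / 2.0) % 1.0
    return (k * x * (1 - x) + (4 - k) * (1 - x) / 2.0) % 1.0

def chaotic_sequence(length, x0=0.37, k=3.99):
    # Iterate the map to obtain a chaotic sequence in (0,1) for population initialization.
    seq = np.empty(length)
    x = x0
    for n in range(length):
        x = logistic_tent(x, k)
        seq[n] = x
    return seq

print(chaotic_sequence(10))
```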
Quasi-reverse learning
Compared with independently and randomly generated candidate solutions, reverse (opposite) solutions are more likely to lie close to the optimal solution54. Therefore, in this work, the chaotic initial population obtained by mapping the Logistic-Tent chaotic sequence into the solution space is subjected to quasi-reverse learning to further optimize the initial population and increase the likelihood of obtaining an excellent initial solution.
Opposition-based learning (OBL) was established by Tizhoosh55. It generates opposite individuals in the region of the current individual and selects the fitter of the two for subsequent iterations, which effectively enhances the population’s quality and diversity and benefits the search. The mathematical formula for opposition-based learning is:
where LB and UB are the lower and upper boundaries of the search space, respectively.
Quasi-opposition-based learning (QOBL) is a variant of OBL56, and previous studies have demonstrated that quasi-opposite solutions are more effective than plain opposite solutions in finding the global optimum. QOBL generates the population as follows:
where \(X(n)\) is the chaotic initial population obtained from the Logistic-Tent chaotic mapping of Eq. (13), and LB and UB are its lower and upper boundaries, respectively. \(X_{QOBL}(n)\) is the new population produced via quasi-reverse learning. MP is given by Eq. (17).
CS is shown in Eq. (18).
Flowchart of chaotic inverse exploration initialization strategy
As shown in Fig. 3, this strategy first generates a chaotic initial population with the Logistic-Tent chaotic map and then generates its quasi-reverse solutions using quasi-reverse learning. This explores different areas of the problem space, avoids over-concentration of individuals in certain regions during initialization, and reduces the likelihood of premature convergence to a local optimum. Finally, the chaotic initial population is merged with the reverse population, the fitness is calculated, and the best n individuals are chosen as the new initial population. In this way the reverse solutions complement the original solutions and improve the coverage of the population, thereby strengthening the algorithm’s early-stage exploration capability.
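The whole initialization strategy can be summarized in a few lines. In the sketch below, the quasi-opposite solution is generated in the standard QOBL way (a random point between the interval midpoint MP = (LB + UB)/2 and the opposite solution LB + UB − X); since Eqs. (16)–(18) are not reproduced above, this is a sketch under those assumptions rather than the paper’s exact formulas.

```python
import numpy as np

def _lt(x, k=3.99):
    # One step of the Logistic-Tent composite map (assumed form, see the previous sketch).
    half = x / 2.0 if x < 0.5 else (1 - x) / 2.0
    return (k * x * (1 - x) + (4 - k) * half) % 1.0

def chaotic_reverse_init(n, dim, LB, UB, fobj, rng, k=3.99):
    """Chaotic inverse exploration initialization: a Logistic-Tent chaotic population plus its
    quasi-opposite population, merged and truncated to the n fittest individuals (QOBL form assumed)."""
    # 1) Chaotic population mapped into [LB, UB].
    C = np.empty((n, dim))
    x = rng.random()
    for i in range(n):
        for j in range(dim):
            x = _lt(x, k)
            C[i, j] = LB[j] + x * (UB[j] - LB[j])
    # 2) Quasi-opposite population: random points between the midpoint MP and the opposite solution.
    MP = (LB + UB) / 2.0
    opposite = LB + UB - C
    lo, hi = np.minimum(MP, opposite), np.maximum(MP, opposite)
    Q = lo + rng.random((n, dim)) * (hi - lo)
    # 3) Merge, evaluate, and keep the n fittest individuals as the new initial population.
    merged = np.vstack([C, Q])
    fitness = np.apply_along_axis(fobj, 1, merged)
    return merged[np.argsort(fitness)[:n]]

rng = np.random.default_rng(4)
LB, UB = -100 * np.ones(10), 100 * np.ones(10)
pop = chaotic_reverse_init(30, 10, LB, UB, fobj=lambda v: np.sum(v ** 2), rng=rng)
print(pop.shape)  # (30, 10)
```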
t-Distribution feeding strategy
An adaptive t-distributed feeding strategy was employed to define the connection between feeding behavior and temperature. This approach increased population variety and enhanced the algorithm’s local search effectiveness.
The t-distribution, also referred to as Student’s distribution, is introduced in this work. Its probability density function, shown in Eq. (19), has a degree-of-freedom parameter n. This study uses Eq. (19) to model the link between temperature and feeding quantity. The t-distribution combines features of the Gaussian and Cauchy distributions; Fig. 4 compares the three distributions.
where \(\Gamma\) denotes the Gamma function.
From Fig. 4, it is evident that early in the iteration the t-distribution behaves like the Cauchy distribution, enhancing the population’s diversity. In the middle and late stages of the iteration, the degree-of-freedom parameter is larger and the t-distribution approximates the Gaussian distribution used in the original algorithm, which strengthens local exploitation. The t-distribution also has heavier tails than the Gaussian distribution, which means it better captures extreme values and outliers. For modeling data such as crayfish food intake, which is highly volatile, the t-distribution is a more realistic reflection of the actual situation. Therefore, the adaptive t-distribution feeding strategy maintains the algorithm’s stability while enhancing the population’s diversity.
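One plausible way to realize the adaptive t-distribution feeding step is sketched below: the degrees of freedom grow with the iteration counter, so early perturbations are Cauchy-like and later perturbations are nearly Gaussian. The exact coupling to the feeding amount in Eq. (19) is not reproduced in the text, so this is an illustrative assumption.

```python
import numpy as np

def t_distribution_feed(X_i, X_food, p, t, rng):
    """Adaptive t-distribution feeding perturbation (illustrative form).
    Degrees of freedom = current iteration t: heavy-tailed (Cauchy-like) early on,
    approximately Gaussian in later iterations."""
    step = rng.standard_t(df=max(t, 1), size=X_i.shape)   # adaptive t-distributed random factor
    return X_i + p * step * (X_food - X_i)                # move toward the food with a t-distributed step

rng = np.random.default_rng(5)
print(t_distribution_feed(rng.random(5), rng.random(5), p=0.1, t=3, rng=rng))
```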
Curve growth guidance factor
The curve growth acceleration factor is used to control the influence of the global and individual optimal solutions, raising the likelihood of discovering the global optimum and enhancing search efficiency.
\(X_{shade}\) is an important guide for the evolution of the crayfish population. As the number of iterations rises, many individuals may become trapped in local optima. To preserve population diversity and enhance the algorithm’s capacity to escape these local optima, the influence of the global optimum on the search process must be increased. This adjustment makes the search more stable and comprehensive.
To achieve this, we introduce two parameters, \(C_1\) and \(C_2\), defined in Eqs. (20) and (21), which respectively control the influence of the global and individual optimal solutions. These parameters produce different weight distributions at different stages of the iteration, increasing the variety and unpredictability of the search. This flexibility enables the algorithm to avoid stagnating in local optima.
The position update formula for \(X_{shade}\) is accordingly updated to Eq. (23). By dynamically adjusting \(C_1\) and \(C_2\), the algorithm balances discovering new solutions with exploiting those already found. Specifically, as \(C_1\) increases, \(C_2\) decreases accordingly (Fig. 5(a)); this ensures that information about the global and individual optimal solutions is combined, making fuller use of the current population, improving search efficiency, and raising the likelihood of finding the global optimum. As the iteration count rises, the values of \(C_1\) and \(C_2\) are adjusted automatically, allowing the algorithm to focus on extensive exploration at the beginning and progressively shift to localized, refined exploitation in the later stages. This dynamic adjustment mechanism helps boost the algorithm’s convergence speed and global search capability.
To make \(C_1\) and \(C_2\) meet these requirements, the parameters of Eq. (20) were set, after experimentation, to a = 0.11, b = 1, c = 1, d = 0.11, f = 1, g = 1. As shown in Fig. 5(b), sensitivity analysis shows that:
- When a = d = 0.11, the \(C_1\) curve exhibits a smooth transition, balancing exploration and exploitation, which leads to stable and effective search performance.
- When a = d = 0.5, \(C_1\) rises too early, leading to premature convergence to local optima and reduced global exploration capacity.
- When a = d = 0.01, \(C_1\) increases too slowly, resulting in low global search efficiency and thus lower convergence efficiency.
- Setting b = c = f = g = 1 simplifies the formula structure and reduces the complexity of parameter tuning, making the algorithm easier to implement while maintaining robust performance.
Experimental verification shows that setting a = d = 0.11 and b = c = f = g = 1 achieves the best balance between exploration and exploitation.
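Equations (20) and (21) are not reproduced above. As an illustrative stand-in only, the sketch below uses a standard logistic growth curve with the reported growth rate a = 0.11, so that \(C_1\) rises smoothly from near 0 to near 1 while \(C_2 = 1 - C_1\) decays symmetrically; this is an assumption about the shape of the curves, not the paper’s exact expressions.

```python
import numpy as np

def curve_growth_factors(t, T, a=0.11):
    """Illustrative logistic growth-curve stand-in for C1 (rising) and C2 (falling).
    a controls how quickly the transition from exploration to exploitation happens."""
    C1 = 1.0 / (1.0 + np.exp(-a * (t - T / 2.0)))   # S-shaped growth centered at mid-run
    C2 = 1.0 - C1                                    # complementary weight on the individual optimum
    return C1, C2

T = 1000
for t in (1, 250, 500, 750, 1000):
    print(t, curve_growth_factors(t, T))
```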
The dynamically weighted three-step displacement updating strategy updates the summer resort equation, in which the crayfish enters the burrow directly, to Eq. (23) through three improvements: the curve growth guidance factor, the hybrid adaptive cosine exponential weight, and the inverse elite mutation reinforcement strategy.
where \(\omega\) is the novel hybrid adaptive cosine exponential weight proposed in this paper, defined in Eq. (24); its behavior over the iterations is illustrated in Figs. 6 and 7 below.
where a = 5 and b = 0.1. As shown in Fig. 6(a), if a is too small, the initial \(\omega\) becomes too large, resulting in insufficient local search in the early stage, which degrades optimization quality. Conversely, if a is too large, the initial \(\omega\) is smaller, but since the denominator’s decay rate remains the same, \(\omega\) grows more slowly; this improves early-stage accuracy but weakens global search capability, making the algorithm prone to local optima. Additionally, Fig. 6(b) shows that if b is too small, the denominator decays too slowly, causing \(\omega\) to remain at a low value for an extended period and leading to weak global search ability and slow convergence. If b is too large, the denominator decreases rapidly, causing \(\omega\) to reach its maximum too early, which results in premature convergence and insufficient local optimization. Based on these comparisons, the combination a = 5 and b = 0.1 provides adequate local search time in the early phase while ensuring a well-timed transition to global search, preventing both premature convergence and prolonged stagnation. This parameter choice is therefore justified both theoretically and experimentally.
As illustrated in Fig. 7, the weight factor is positively correlated with the number of iterations t and exhibits a nonlinear growth trend, changing slowly in both the initial and final phases of the increase. When the algorithm starts, the inertia weight stays at a low value for an extended period, which strengthens local optimization capability and thereby improves accuracy. In the intermediate and later phases, when the crayfish population has converged more strongly, the inertia weight increases with the number of iterations and stabilizes at a larger value for an extended duration. This adjustment fosters the global optimization potential of the crayfish population and improves speed.
The design of the inertia weight integrates both cosine and exponential functions, which allows the algorithm to dynamically adjust its search intensity. This ensures a balanced approach, accommodating the global exploration needs as well as the local convergence requirements, effectively improving overall performance.
\(X_{worst}\) is the position updated by the inverse worst-individual mutation strengthening strategy. In this paper, \(X_{shade}\), i.e., the combination of the global and local optimal solutions, has been fully utilized, while the worst position is normally neglected information. To make better use of the information carried by \(X_{worst}\), this paper proposes the reverse elite mutation reinforcement strategy. After the worst individual \(X_{worst}\) is selected, it is first subjected to quasi-reverse learning, which extracts the useful information in the worst-positioned individual and generates a reverse elite solution that guides the search toward better answers. Second, a nonlinear Cauchy-Gaussian mutation strategy is applied to the worst position \(X_{worst}\). This method introduces random perturbations around the globally optimal individuals, preventing the algorithm from being trapped in local optima. Based on the Cauchy-Gaussian mutation strategy, we introduce the hybrid adaptive cosine exponential weight \(\omega\) to perturb the generated reverse elite solution with a nonlinear Cauchy-Gaussian mutation, as in Eq. (25). Early in the iteration the population is more dispersed, so the Cauchy distribution is given a larger weight, which favors large-step jumps out of the current position and helps the search for better solutions. Late in the iteration the population is more aggregated, so the Gaussian distribution is given a larger weight, which favors small-step local perturbations around the current position, helping avoid local optima and improving the algorithm’s convergence accuracy.
where \(X_{oworst}\) is the reverse elite solution after perturbation by the nonlinear Cauchy-Gaussian mutation strategy and \(X_{oworst1}\) is the reverse elite solution obtained by quasi-reverse learning. Cauchy and Gaussian denote random variables drawn from the Cauchy and Gaussian distributions, respectively.
After perturbing the reverse elite solution, a greedy strategy is applied: the individuals before and after the mutation are compared, and the one with better fitness is selected as the new individual. The greedy selection of the optimal solution is shown in Eq. (26).
where \(fobj(\cdot)\) is the fitness function.
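A compact sketch of Eqs. (25)–(26) is given below. The quasi-reverse step and the \(\omega\)-weighted mixture of Cauchy and Gaussian perturbations follow the verbal description above (Cauchy-dominated early, Gaussian-dominated late); since the exact weighting in Eq. (25) is not reproduced, the mixture form should be treated as an assumption.

```python
import numpy as np

def reverse_elite_mutation(X_worst, LB, UB, omega, fobj, rng):
    """Reverse elite mutation reinforcement (illustrative form of Eqs. (25)-(26)).
    omega in [0, 1] grows with the iterations: Cauchy-dominated early, Gaussian-dominated late."""
    # Quasi-reverse learning of the worst individual (QOBL form assumed).
    MP = (LB + UB) / 2.0
    opposite = LB + UB - X_worst
    lo, hi = np.minimum(MP, opposite), np.maximum(MP, opposite)
    X_oworst1 = lo + rng.random(X_worst.shape) * (hi - lo)
    # Nonlinear Cauchy-Gaussian perturbation weighted by omega (assumed mixture).
    cauchy = rng.standard_cauchy(X_worst.shape)
    gauss = rng.standard_normal(X_worst.shape)
    X_oworst = np.clip(X_oworst1 * (1.0 + (1.0 - omega) * cauchy + omega * gauss), LB, UB)
    # Greedy selection (Eq. (26)): keep whichever individual has the better fitness.
    return X_oworst if fobj(X_oworst) < fobj(X_worst) else X_worst

rng = np.random.default_rng(6)
LB, UB = -100 * np.ones(10), 100 * np.ones(10)
X_w = rng.uniform(LB, UB)
print(reverse_elite_mutation(X_w, LB, UB, omega=0.3, fobj=lambda v: np.sum(v ** 2), rng=rng))
```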
The variables \(r_1\) and \(r_2\) are random numbers in the range [0,1]; they enhance individual randomness, disrupt directional tendencies, and introduce greater evolutionary possibilities. By incorporating sine and cosine functions, individual positions are updated in various ways, effectively preventing premature convergence and preserving population diversity. This approach greatly increases the resilience and stability of the algorithm, enabling a more balanced process of exploration and exploitation.
In the early phase of the algorithm, \(X_{shade}\), representing the global and local optimal solutions, guides the overall position toward a balanced state. In the later stages, the reverse elite solution of \(X_{worst}\) may lie in unexplored regions, so to make fuller use of this information the proportion of the reverse elite solution of \(X_{worst}\) is gradually increased. Therefore, the formula introduces \(C_1\) and \(C_2\), as in Eqs. (20) and (21): \(C_1\) gradually increases while \(C_2\) gradually decreases.
MSCOA pseudo-code
Algorithm 1
Pseudo-code of the MSCOA.
Initialize the population using Eq. (14) and Eq. (16)
Calculate the fitness of the population to obtain \(X_G\), \(X_L\), \(X_{worst}\)
While t < T
    Define the temperature temp using Eq. (1)
    Calculate the values of \(C_1\), \(C_2\), and \(\omega\)
    Update \(X_{worst}\) using Eq. (17)
    for i = 1 to N
        If temp > 30
            Define the cave \(X_{shade}\) using Eq. (23)
            If rand < 0.5
                Perform the Summer Resort Stage
                Calculate \(X_{oworst}\) using Eq. (26)
                Update \(X_{worst}\) according to Eq. (27)
                The crayfish enters the summer resort stage based on Eq. (24)
            Else
                Perform the Competition Stage
                The crayfish competes for caves using Eq. (7)
            End
        Else
            Perform the Foraging Stage
            Obtain the food intake p and the food size Q from Eq. (18) and Eq. (9)
            If Q > 2
                The crayfish shreds the food and then forages
            Else
                The crayfish forages based on Eq. (12)
            End
        End
    End
    Update the fitness values and \(X_G\), \(X_L\), \(X_{worst}\)
    t = t + 1
End
MSCOA flowchart
In Fig. 8, the workflow of MSCOA is shown. First, population initialization is performed using Eq. (14) and Eq. (16) to set up the crayfish population, the fitness values are calculated, and the global optimum \(X_G\), local optimum \(X_L\), and global worst position \(X_{worst}\) are obtained. Next, Eq. (16) is used to update the global worst position \(X_{worst}\), and Eq. (1) is used to set the temperature temp and determine whether the algorithm enters a high-temperature environment (i.e., temp > 30). When the temperature exceeds 30 degrees, the algorithm proceeds to either the summer resort stage or the competition stage after defining \(X_{shade}\) through Eq. (22). If rand < 0.5, the algorithm simulates the crayfish’s behavior in a high-temperature environment and enters the summer resort stage: Eq. (25) is used to calculate \(X_{oworst}\), Eq. (26) is used to update \(X_{worst}\), and Eq. (23) is used to adjust the position of the crayfish. Otherwise, if rand > 0.5, the crayfish moves to the competition stage and competes for burrows for better survival, which is carried out through Eq. (6). When the temperature does not exceed 30 degrees, the food size Q is computed using Eq. (8), the food intake p is calculated by Eq. (19), and the foraging behavior is chosen based on the size of Q. If Q > 2, the crayfish shreds the food according to Eq. (9) and then forages according to Eq. (10); when Q ≤ 2, Eq. (11) is applied to perform the foraging behavior. In each iteration, the algorithm updates the global and local optimal solutions \(X_G\) and \(X_L\) and gradually approaches the optimal solution based on the fitness values. The algorithm ends when the termination condition t > T is satisfied.
Compared with traditional COA, MSCOA introduces additional mechanisms that enhance search diversity and global exploration, avoiding entrapment in local optima and improving performance on challenging optimization tasks.
Computational complexity analysis
The time complexity reflects the computational cost of the algorithm. Let T represent the number of iterations, n the population size, and dim the dimension of the search space. In the original COA, the initialization complexity is \(O(n \cdot dim)\) and each iteration costs \(O(n \cdot dim)\), leading to an overall complexity of \(O(T \cdot n \cdot dim)\).
In MSCOA, after applying chaotic inverse exploration during initialization, the complexity becomes \(O(n \cdot dim + n \log n)\), of which \(O(n \cdot dim)\) is the dominant term. In the iterative process, despite the introduction of the dynamically weighted three-step displacement updating strategy, the time complexity remains consistent with COA at \(O(T \cdot n \cdot dim)\).
Thus, MSCOA shares the same asymptotic time complexity as COA, \(O(T \cdot n \cdot dim)\); the added computations from chaotic reverse initialization slightly affect the constant factors without altering the overall order.
Experimental results and discussions
Four primary experiments are conducted in this section. The first tests the efficacy of the enhancement strategies introduced into the algorithm; the second evaluates MSCOA’s performance against a number of comparable algorithms; the third compares the proposed algorithm with the algorithms from the second experiment using the Wilcoxon rank-sum test, which determines whether MSCOA differs noticeably from the other algorithms; and the fourth uses the Friedman test to assess the overall performance of these comparative algorithms.
Experimental setup
To thoroughly examine the optimization accuracy and convergence rate of the proposed algorithm, 17 standard test functions were selected from the CEC2005 and CEC2019 test sets for comparative experiments, aiming to comprehensively analyze the performance of the optimization algorithm. The function names, expressions, search ranges, and optimal solutions are shown in Table 1. Because it contains a variety of optimization problem types, including unimodal, multimodal, hybrid, and high-dimensional problems, this suite is appropriate for assessing an algorithm’s local search, global search, robustness, and convergence ability.
Among them, F1 through F6 are unimodal functions, meaning they have a single global optimum. These functions are appropriate for evaluating the algorithm’s capacity for local searches and rate of convergence. Multimodal functions, such as F7–F12, contain several local optima but only one global optimum. These functions are primarily used to assess the algorithm’s capacity for both global search and escape from local optima. The hybrid composite functions F13–F15 have a very complicated search space and incorporate several distinct mathematical functions. The main purpose of these functions is to evaluate the optimization algorithm’s resilience and capacity for global search. High-dimensional multimodal functions, such as F16 and F17, have increased dimensionality and several local optima. They are employed to evaluate the algorithm’s capacity for exploration and exploitation abilities in high-dimensional complex environments.
These different categories of test functions are used to evaluate the optimization algorithm’s convergence speed, global search ability, ability to avoid local optima, robustness, and high-dimensional optimization capability. To obtain unbiased results, the population size and the maximum number of iterations are set to 30 and 1000, respectively, for all functions in all comparisons.
All experiments were run on a PC equipped with an Intel (R) Core (TM) i5-12500 H CPU, 16.00 GB RAM, and Windows 11 operating system, and all algorithms were implemented using MATLAB R2022a.
Comparison with different improvement strategies
MSCOA is compared with four single-strategy variants: COA1, which applies only the chaotic inverse exploration initialization strategy; COA2, which employs only the adaptive t-distribution feeding strategy; COA3, which employs only the curve growth guidance factor; and COA4, which implements only the dynamically weighted three-step displacement update strategy. The parameter settings are the same as elsewhere in this paper, so as to verify the superiority of MSCOA. To illustrate the impact of each individual strategy on COA more clearly, the convergence curves of the algorithms on functions F1–F17 are given in Fig. 9.
In Fig. 9, F1–F6 demonstrate that each strategy enhances convergence speed to varying degrees, with MSCOA consistently achieving faster convergence to lower fitness values compared to the original COA and its variants. This indicates that MSCOA excels in local search capabilities, allowing for more efficient and effective optimization. From F7–F12, it is evident that each strategy contributes to the superior performance of MSCOA, enhancing its exploration ability and enabling it to tackle complex optimization problems more effectively. From F13–F15, it can be observed that every strategy, to some extent, strengthens MSCOA’s global search capability, endowing it with better robustness and improved global optimization performance. Finally, from F16 and F17, it is clear that MSCOA also demonstrates strong exploration and exploitation abilities in high-dimensional complex environments.
Taken together, the algorithms applying the various improvement strategies show different degrees of performance improvement. The MSCOA algorithm incorporating the various improvement strategies has enhanced optimization search and solution capabilities. It provides better robustness in complex optimization problems, thus potentially avoiding premature convergence.
Comparison between different novel intelligent algorithms
MSCOA is compared against COA27, GWO15, the Subtraction-Average-Based Optimizer (SABO)57, the Dung Beetle Optimizer (DBO)13, WWPA58, the Chernobyl Disaster Optimizer (CDO)59, Lungs Performance-Based Optimization (LPO)60, and the Artificial Protozoa Optimizer (APO)61 under the same parameter settings: a population size of 30, a spatial dimension of 20, and a maximum iteration count of 1000. (Many optimization algorithms require a sufficient number of iterations to converge to the optimal or near-optimal solution; too few iterations leave the search space under-explored and yield suboptimal results, whereas excessively many iterations significantly raise computational time and resource consumption. A maximum of 1000 iterations strikes a balance between convergence performance and computational cost and is widely used in the optimization literature, allowing a consistent and fair comparison across algorithms.) To objectively assess optimization performance, each method was executed 30 independent times with these parameter values, and the optimal value, average value, and standard deviation were recorded for each algorithm. A lower average value indicates higher optimization accuracy, while a lower standard deviation signifies greater stability. The test results are presented in Table 2, with the best outcomes indicated in bold. Figure 10 provides a comparative visualization of the fitness curves of the algorithms on functions F1 to F17, and Fig. 11 displays the corresponding box plots.
Table 2 reports the results of running MSCOA and the various baseline algorithms on functions F1–F17. It indicates that MSCOA attains the optimal value on the tested functions F1–F5, F8, F10, F14, and F16 for dimension dim = 30. For functions F6, F7, F11, and F17, the optimization results are very close to the ideal value even though MSCOA does not reach the theoretical optimum. Meanwhile, a comparative analysis of the standard deviations (std) across F1–F17 reveals that MSCOA has significantly smaller deviations in the majority of cases, showing that MSCOA has strong convergence stability and good robustness.
Figure 10 shows that MSCOA exhibits fast local convergence on the unimodal functions (F1–F6), quickly reaching the optimal value. On the multimodal functions (F7–F12), MSCOA shows strong global search ability, successfully avoiding local optima. For functions F13–F15, MSCOA demonstrates good adaptability, robustness, and strong global search capability. On the high-dimensional multimodal functions, MSCOA still performs excellently in terms of search and exploration ability, even in complex situations. Overall, MSCOA searches more effectively, but there is still room for improvement on complex functions and in high-dimensional spaces.
It is evident from Fig. 11 that MSCOA delivers superior overall performance across multiple runs compared with the other algorithms, usually finding better solutions with more concentrated results and more stable behavior. It clearly outperforms the other algorithms, which exhibit higher medians, larger fluctuations, and more outliers.
Comparison of Wilcoxon rank sum test
To assess the efficacy and optimization performance of MSCOA, this study uses the Wilcoxon rank-sum test62 to check whether the results of MSCOA are statistically distinct from those of the other algorithms at a significance level of α = 5%63. The null hypothesis is rejected when p < α, indicating that the two algorithms differ significantly from one another; when p > α, the hypothesis is accepted, indicating no discernible difference between the two methods. Table 4 displays the test results of MSCOA against COA, SABO, GWO, WWPA, CDO, LPO, and APO at a significance level of α = 5%.
Table 4 shows that, for the majority of test functions, most of the p-values of MSCOA are smaller than the significance level α, indicating notable differences between MSCOA’s results and those of the other seven methods.
Runtime analysis on different problem sizes
To supplement the theoretical complexity discussion and validate the scalability of MSCOA, this study conducts comprehensive experiments on benchmark functions F1–F4 across varying problem dimensions (10, 30, 50, and 100). Additionally, a log-log fitting analysis is performed to derive the empirical time complexity. Table 5 presents the average computational time (in seconds) of MSCOA and the baseline algorithms.
As shown in Table 5, the runtime of all algorithms increases with the problem dimensionality. MSCOA exhibits relatively higher computational time in high-dimensional scenarios (e.g., 100 dimensions), primarily due to the incorporation of enhanced strategies such as the chaotic reverse initialization strategy, which adds computational overhead.
However, taken together with Fig. 12 and the empirical time complexity (MSCOA approximates O(n)), the observed linear increase in runtime aligns well with theoretical expectations. Although some of the other algorithms exhibit slower runtime growth, MSCOA balances computational efficiency and optimization performance, ensuring effective search capability while maintaining a reasonable trade-off between runtime and solution quality.
Scalability test of MSCOA for large-scale problems
In order to further verify the scalability of MSCOA in high-dimensional optimization tasks, we conducted additional experiments on complex optimization problems with dimensions of D = 200 and D = 500. Table 3 summarizes the mean, standard deviation, and minimum value metrics obtained by MSCOA on selected variable-dimensional benchmark functions F1–F4, F7–F10, F13, and F14. The results indicate that even with a substantial increase in dimensionality, MSCOA continues to exhibit relatively stable and competitive performance.
From Table 3, it can be observed that for most functions, the fitness values achieved by MSCOA remain close to the global optimum, accompanied by small standard deviations. This outcome implies that, despite the significant increase in dimensionality, MSCOA still maintains commendable convergence stability and accuracy. These findings are consistent with the time-complexity analysis presented in the previous section, which shows that the algorithm’s runtime grows linearly with the problem dimension. Hence, MSCOA demonstrates strong scalability in large-scale optimization tasks.
Application of MSCOA in predicting breast cancer
Extreme learning machine
Huang et al.64 introduced the Extreme Learning Machine (ELM) learning technique for a single hidden layer feedforward neural network. The structure of the ELM is illustrated in Fig. 13.
ELM involves solving a system in which the weights between the input and hidden layers are randomly assigned, while the output weights \(\beta\) are determined by minimizing the difference between the predicted and target outputs. This is formulated as Eq. (27).
where H is the output matrix of the hidden layer, \(\beta\) is the vector of output weights, and T is the matrix of target outputs.
The output weights are obtained using the Moore-Penrose generalized inverse of H65, denoted \(H^{+}\). The solution is expressed as \(\beta = H^{+}T\).
This yields a unique optimal solution once the number of hidden nodes is specified, without adjusting the input weights or the biases of the hidden neurons66. These characteristics have attracted many researchers to study the theory, applications, and extensions of ELM67,68. To date, supervised learning has been the primary focus, with promising results69,70.
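The closed-form training described above fits in a few lines. The sketch below is a minimal single-hidden-layer ELM in which the hidden-layer output matrix H and the pseudoinverse solution \(\beta = H^{+}T\) follow the standard ELM formulation; the sigmoid activation is an assumption, since the paper does not state which activation is used.

```python
import numpy as np

class ELM:
    """Minimal Extreme Learning Machine: random input weights/biases, output weights by pseudoinverse."""
    def __init__(self, n_hidden, rng=None):
        self.n_hidden = n_hidden
        self.rng = rng if rng is not None else np.random.default_rng()

    def _hidden(self, X):
        # Hidden-layer output matrix H with a sigmoid activation (activation choice is an assumption).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, T):
        n_features = X.shape[1]
        self.W = self.rng.uniform(-1, 1, (n_features, self.n_hidden))  # random input weights
        self.b = self.rng.uniform(-1, 1, self.n_hidden)                # random hidden biases
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T                              # beta = H^+ T (Moore-Penrose inverse)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Tiny usage example on random data (binary labels encoded as a column vector).
rng = np.random.default_rng(7)
X = rng.random((100, 30))
T = (rng.random((100, 1)) > 0.5).astype(float)
model = ELM(n_hidden=8, rng=rng).fit(X, T)
pred = (model.predict(X) > 0.5).astype(float)
print("train accuracy:", (pred == T).mean())
```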
MSCOA-ELM algorithm modeling
The MSCOA-ELM model is constructed from the two algorithms described in this paper. MSCOA obtains the optimal parameters by executing its three phases, namely the heat-avoidance (summer resort) phase, the competition phase, and the foraging phase, and these parameters are then fed into the ELM to obtain the best prediction performance. Figure 14 depicts the execution procedure of the MSCOA-ELM model.
Figure 14 shows the ELM model optimized by MSCOA. First, after the algorithm starts, the parameters of MSCOA and ELM are initialized to ensure that all variables and parameters are in their initial state. Next, the fitness value of each individual in the population is computed, and individual quality is measured by this fitness. The algorithm then enters the behavioral updating phase, executing the heat-avoidance, competition, and foraging behaviors in turn and continually updating the positions of individuals in the population through Eq. (21), Eq. (8), Eq. (10), or Eq. (11), so as to seek the globally optimal solution. After each update, the stopping condition is checked; if it is not satisfied, the position update and fitness evaluation are repeated until it is. When the stopping condition is reached, the algorithm outputs the optimized MSCOA parameters and the corresponding ELM model to ensure that the model parameters are optimal. Finally, the algorithm ends and outputs the best optimization results for further model training and prediction.
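In practice, the link between MSCOA and ELM is the fitness function: each candidate crayfish encodes a full set of ELM input weights and hidden biases, and its fitness is the classification error of the resulting ELM on a held-out split. The sketch below illustrates this encoding; the specific error measure and split are assumptions, since the paper does not spell them out.

```python
import numpy as np

def decode_and_score(candidate, n_features, n_hidden, X_tr, y_tr, X_val, y_val):
    """Fitness for MSCOA-ELM (illustrative): decode a candidate vector into ELM input
    weights W and biases b, solve the output weights in closed form, and return the
    validation misclassification rate (lower is better)."""
    W = candidate[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = candidate[n_features * n_hidden:]
    H_tr = 1.0 / (1.0 + np.exp(-(X_tr @ W + b)))
    beta = np.linalg.pinv(H_tr) @ y_tr.reshape(-1, 1)      # closed-form output weights
    H_val = 1.0 / (1.0 + np.exp(-(X_val @ W + b)))
    pred = (H_val @ beta).ravel() > 0.5
    return np.mean(pred != y_val.astype(bool))             # error rate used as the fitness value

# Each MSCOA individual therefore has dimension n_features * n_hidden + n_hidden.
rng = np.random.default_rng(8)
n_features, n_hidden = 30, 8
cand = rng.uniform(-1, 1, n_features * n_hidden + n_hidden)
X = rng.random((120, n_features)); y = (rng.random(120) > 0.5).astype(float)
print(decode_and_score(cand, n_features, n_hidden, X[:96], y[:96], X[96:], y[96:]))
```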
Performance evaluation metrics
A confusion matrix categorizes classification outcomes along two dimensions, the true class and the predicted class; the confusion matrix for binary classification is displayed in Fig. 15.
The matrix has four components: TP (True Positive), correctly classified positive examples; TN (True Negative), correctly classified negative examples; FP (False Positive), negative examples incorrectly classified as positive (Type I error); and FN (False Negative), positive examples incorrectly classified as negative (Type II error).
In this paper, 20% of the data was used for testing all models, while the remaining 80% was used for training. Several performance metrics, including each model’s accuracy, precision, recall, and F1 score, are computed from the four entries of the confusion matrix. The performance measures are defined as follows:
Accuracy is the proportion of correctly predicted instances among all instances.
Precision is the ratio of correctly predicted positive cases to the total number of predicted positive cases, as given by Eq. (30).
Recall is the proportion of correctly predicted positive instances among all instances of the actual positive class, as given by the following formula.
F1 score: the harmonic mean of precision and recall, defined as follows:
AUC-ROC curve: the ROC is a probability curve and the AUC measures the model’s capacity to differentiate between categories. The ROC curve plots the true positive rate (TPR) on the y-axis against the false positive rate (FPR) on the x-axis; TPR is a synonym for recall, and FPR is defined as follows:
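The metric formulas referenced above are not reproduced in the text; the sketch below computes them from the four confusion-matrix entries using their standard definitions.

```python
def classification_metrics(TP, TN, FP, FN):
    """Standard binary-classification metrics computed from the confusion matrix."""
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP) if (TP + FP) else 0.0
    recall = TP / (TP + FN) if (TP + FN) else 0.0          # also the true positive rate (TPR)
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    fpr = FP / (FP + TN) if (FP + TN) else 0.0             # false positive rate, x-axis of the ROC curve
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1, "fpr": fpr}

print(classification_metrics(TP=70, TN=40, FP=2, FN=2))
```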
Optimal hidden neurons in ELM
To determine the optimal number of hidden neurons in the ELM and assess its impact on performance, we conducted comparative experiments using the MSCOA-ELM model with configurations comprising 2, 5, 8, and 20 hidden neurons. The results indicated that when the number of hidden neurons was set to 5 or 20, the accuracy reached 0.9825. As shown in Fig. 16, the average optimal number of hidden neurons was 7.50, with an average optimal accuracy of 0.9912. A smaller number of neurons resulted in underfitting due to insufficient feature learning capacity, while an excessive number increased model complexity without significantly enhancing accuracy. Considering the trade-offs among model complexity, generalization ability, and result stability, we ultimately selected 8 hidden neurons. This configuration enables the ELM to effectively extract features while avoiding both underfitting and overfitting, thereby ensuring reliable performance in subsequent applications.
Results and discussion
1) Generalization ability evaluation of MSCOA-ELM on the diabetes dataset
First, to evaluate the generalization capability of MSCOA-ELM, experiments were conducted on the diabetes dataset from the UCI Machine Learning Repository. The performance metrics of each model are shown in Table 4, and the ROC curves are shown in Fig. 17. As shown in Table 4, MSCOA-ELM exhibits marked improvements in accuracy, precision, recall, F1 score, and AUC compared to the original ELM model, indicating that the introduction of MSCOA can more effectively optimize the ELM parameters and enhance the model's feature learning and classification performance. However, from an overall perspective, certain indicators of SABO-ELM and APO-ELM remain slightly superior to those of MSCOA-ELM, suggesting that while MSCOA-ELM achieves commendable performance on this dataset, there is still room for further improvement. Overall, the experimental results on the diabetes dataset confirm the promising generalization ability of MSCOA-ELM, but also highlight the need for deeper research into algorithm refinement and parameter tuning.
2) Evaluation of MSCOA-ELM on the breast cancer dataset
Subsequently, MSCOA is formally applied to the Wisconsin Diagnostic Breast Cancer dataset, which can be downloaded from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php) by searching for "Wisconsin Diagnostic Breast Cancer".
This section aims to use machine learning techniques to establish an adjunctive diagnostic model for breast cancer prediction. Numerous models, including the proposed MSCOA-ELM, were trained, tuned, and assessed. Figure 18 illustrates the confusion matrices of all predictive models applied to the Wisconsin Diagnostic Breast Cancer dataset, alongside a visualization of the classification results. Each model's performance was assessed using the metrics described above, and Table 5 presents the findings.
As presented in Table 5, MSCOA-ELM achieved 100% accuracy, precision, recall, and F1 score, demonstrating that MSCOA can robustly converge to an optimal set of ELM weights and biases. GWO-ELM (accuracy = 0.976) and COA-ELM (accuracy = 0.967) approached the performance of MSCOA-ELM but exhibited relatively lower recall. Among the remaining models, DBO-ELM showed the highest recall (0.959) but a lower precision (0.946). SABO-ELM maintained a perfect precision of 1.0, yet its recall was only 0.893. WWPA-ELM presented the lowest recall (0.852) among the models. CDO-ELM attained an F1 score of 0.927, surpassing WWPA-ELM's 0.915 and indicating better overall performance. The original ELM exhibited notably lower accuracy (0.710) and F1 score (0.778), underscoring the necessity of metaheuristic parameter optimization. The AUC values and ROC curves shown in Fig. 19 further validate these results, and the ROC curves and AUC metrics of the other models reflect their trade-offs between true positive rate and false positive rate, consistent with the precision-recall behavior analyzed above.
3) Robustness verification on noisy data
To validate the robustness of MSCOA-ELM in realistic medical scenarios, this study introduces Gaussian noise and label flipping into the Wisconsin Diagnostic Breast Cancer (WDBC) dataset to simulate the measurement errors and labeling mistakes that may occur in actual clinical data. Specifically, Gaussian noise with zero mean and standard deviation σ is added to the input features, with the noise level set to σ = 5%. In addition, 5% of the sample labels are randomly flipped to simulate mislabeling. After generating the noisy data, the dataset is partitioned into training and testing sets using the original 80%/20% split.
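A minimal sketch of this noise-injection protocol is shown below. The paper states a noise level of σ = 5% without fixing the reference scale, so the sketch assumes 5% of each feature's standard deviation; the random seed and the use of scikit-learn's bundled WDBC copy are likewise assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X, y = load_breast_cancer(return_X_y=True)

# Gaussian feature noise: zero mean, std assumed to be 5% of each feature's std
sigma = 0.05 * X.std(axis=0)
X_noisy = X + rng.normal(0.0, sigma, size=X.shape)

# Label flipping: randomly invert 5% of the binary labels
y_noisy = y.copy()
flip_idx = rng.choice(len(y), size=int(0.05 * len(y)), replace=False)
y_noisy[flip_idx] = 1 - y_noisy[flip_idx]

# Same 80%/20% partition as the clean-data experiments
X_tr, X_te, y_tr, y_te = train_test_split(X_noisy, y_noisy, test_size=0.2, random_state=42)
print(X_tr.shape, X_te.shape)
```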
Experimental results, as shown in Table 6, reveal that on the original, noise-free data, the standard ELM achieves an accuracy of only 64.0351%, whereas the MSCOA-ELM, by optimizing the ELM parameters, attains an accuracy of 92.9825%, thereby demonstrating a stronger feature learning capability. After introducing noise, the standard ELM’s accuracy increased to 78.0702%, which may be attributed to overfitting to noise patterns; however, MSCOA-ELM maintained a high accuracy of 90.3509%. This indicates that MSCOA-ELM effectively filters out noise impacts and sustains model stability, making it more suitable for real-world medical scenarios where data imperfections are prevalent.
Conclusion and future work
This study proposes an Enhanced Crayfish Optimization Algorithm (MSCOA) by integrating chaotic reverse exploration initialization, an adaptive t-distributed feeding strategy, and a ternary optimization mechanism, significantly improving the algorithm’s global search capability, local exploitation efficiency, and dynamic balance performance.
Experimental results demonstrate that MSCOA generally outperforms or matches traditional COA and other metaheuristic algorithms on 17 benchmark functions from CEC2019 and CEC2005. For unimodal functions, MSCOA exhibits superior local exploitation efficiency and consistently identifies global optima. On multimodal, composite, and high-dimensional functions, MSCOA remains competitive, effectively avoiding local optima while balancing global exploration and local optimization. Wilcoxon rank-sum tests (95% confidence level) confirm significant performance differences between MSCOA and comparison algorithms.
Furthermore, MSCOA is integrated with Extreme Learning Machine (ELM) for breast cancer prediction. Evaluated on the Wisconsin Diagnostic Breast Cancer dataset (UCI), MSCOA-ELM achieves 100% accuracy, demonstrating its potential to assist clinicians in earlier and more accurate diagnoses.
However, there are still some limitations in the current study:
Theoretical limitations
- While MSCOA improves the local-global balance, its performance on highly complex problems still leaves room for improvement, and the algorithm's complexity may lead to increased computational overhead in large-scale applications.
- Although the adaptive t-distribution feeding strategy and hybrid weights have been empirically validated, theoretical proofs of convergence remain incomplete.
- Performance in search spaces with more than 1000 dimensions requires further validation, as the inherent randomness of the initial chaotic mapping may affect stability.
- In noisy environments, the performance of MSCOA may deteriorate because measurement errors and imperfect data distort fitness evaluation, which can lead to suboptimal solutions.
- The current version of MSCOA is primarily designed for single-objective optimization, and its effectiveness on multi-objective optimization problems has not been sufficiently validated.
Practical limitations
- Model performance is heavily dependent on the quality of medical data and is sensitive to noise and class imbalance.
- Training and deploying MSCOA-ELM on embedded or resource-constrained devices may face computational and memory limitations, and its training time may be longer than that of lighter-weight models.
Future work will focus on the following areas:
- Enhance MSCOA via novel mechanisms (e.g., multi-strategy fusion) to broaden its applicability across various fields such as finance, energy, and industrial optimization.
- Integrate machine learning with metaheuristics to improve adaptability.
- Apply model pruning or quantization techniques to reduce resource consumption in edge computing environments.
- Investigate strategies to enhance MSCOA's robustness against noise to improve performance in noisy environments.
- Develop multi-objective extensions of MSCOA to enhance its applicability and reliability in complex real-world scenarios.
Finally, at the policy level, MSCOA-ELM provides a cost-effective AI tool for early breast cancer screening and supports standardized diagnostic workflows. To realize this potential:
- Governments and healthcare institutions should prioritize funding for "optimization algorithm + medical AI" research.
- Data-sharing platforms should be established to bridge regional healthcare disparities.
- Open-sourcing the MSCOA-ELM framework could democratize access to advanced diagnostic tools in developing regions.
Data availability
The source code and datasets used in this study are available upon request by contacting the corresponding author at victoria@stu.sxufe.edu.c.
References
Hu, G., Guo, Y., Wei, G. & Abualigah, L. Genghis Khan shark optimizer: a novel nature-inspired algorithm for engineering optimization. Adv. Eng. Inf. 58, 102210 (2023).
Abdollahzadeh, B. et al. Puma optimizer (PO): A novel metaheuristic optimization algorithm and its application in machine learning. Clust. Comput. 27, 5235–5283 (2024).
Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I. & Abualigah, L. Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access 10, 16150–16177 (2022).
Shishavan, S. T. & Gharehchopogh, F. S. An improved cuckoo search optimization algorithm with genetic algorithm for community detection in complex networks. Multimed. Tools Appl. 81, 25205–25231 (2022).
Li, J. et al. Application of XGBoost algorithm in the optimization of pollutant concentration. Atmos. Res. 276, 106238 (2022).
Mohammadi, F. G., Amini, M. H. & Arabnia, H. R. Evolutionary computation, optimization, and learning algorithms for data science. Optim., Learn., Control Interdep. Complex Networks 37–65 (2020).
Yang, X. S. & Karamanoglu, M. Nature-inspired computation and swarm intelligence: a state-of-the-art overview. eds Yang, X. S. & Karamanoglu, M. Nature-Inspired Comput. Swarm Intell. 3–18 (Academic Press, London, 2020).
Pelusi, D. et al. Improving exploration and exploitation via a hyperbolic gravitational search algorithm. Knowl.-Based Syst. 193, 105404 (2020).
Liu, H., Zhang, X., Liang, H. & Tu, L. Stability analysis of the human behavior-based particle swarm optimization without stagnation assumption. Expert Syst. Appl. 159, 113638 (2020).
Abdel-Basset, M., Mohamed, R., Jameel, M. & Abouhawwash, M. Spider wasp optimizer: A novel meta-heuristic optimization algorithm. Artif. Intell. Rev. 56, 11675–11738 (2023).
Akbari, M. A. et al. The cheetah optimizer: A nature-inspired metaheuristic algorithm for large-scale optimization problems. Sci. Rep. 12, 10953 (2022).
Agushaka, J. O., Ezugwu, A. E. & Abualigah, L. Dwarf mongoose optimization algorithm. Comput. Methods Appl. Mech. Eng. 391, 114570 (2022).
Xue, J. & Shen, B. Dung beetle optimizer: A new meta-heuristic algorithm for global optimization. J. Supercomput. 79, 7305–7336 (2023).
Agushaka, J. O., Ezugwu, A. E. & Abualigah, L. Gazelle optimization algorithm: A novel nature-inspired metaheuristic optimizer. Neural Comput. Appl. 35, 4099–4131 (2023).
Mirjalili, S., Mirjalili, S. M. & Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014).
Falahah, I. A. et al. Frilled Lizard optimization: A novel bio-inspired optimizer for solving engineering applications. J. Artif. Intell. Rev. 79, 3631–3678 (2024).
Makhadmeh, S. N. et al. Recent advances in grey wolf optimizer, its versions and applications. IEEE Access 11, 12345–12380 (2023).
Chhabra, A., Hussien, A. G. & Hashim, F. A. Improved bald eagle search algorithm for global optimization and feature selection. Alexandria Eng. J. 68, 141–180 (2023).
Pan, J. S. et al. A parallel compact gannet optimization algorithm for solving engineering optimization problems. Mathematics 11, 439 (2023).
Ikram, R. M. A. et al. A novel swarm intelligence: cuckoo optimization algorithm (COA) and sailfish optimizer (SFO) in landslide susceptibility assessment. Stoch. Environ. Res. Risk Assess. 37, 1717–1743 (2023).
Houssein, E. H. & Sayed, A. Dynamic candidate solution boosted beluga whale optimization algorithm for biomedical classification. Mathematics 11, 707 (2023).
Jovanovic, D. et al. Tuning machine learning models using a group search firefly algorithm for credit card fraud detection. Mathematics 10, 2272 (2022).
Han, E. & Ghadimi, N. Model identification of proton-exchange membrane fuel cells based on a hybrid convolutional neural network and extreme learning machine optimized by improved honey badger algorithm. Sustain. Energy Technol. Assess. 52, 102005 (2022).
Ewees, A. A., Al-qaness, M. A., Abualigah, L. & Abd Elaziz, M. HBO-LSTM: Optimized long short term memory with heap-based optimizer for wind power forecasting. Energy Convers. Manag. 268, 116022 (2022).
Ma, J. et al. A comprehensive comparison among metaheuristics (MHs) for geohazard modeling using machine learning: insights from a case study of landslide displacement prediction. Eng. Appl. Artif. Intell. 114, 105150 (2022).
Jovanovic, A. et al. Particle swarm optimization tuned multi-headed long short-term memory networks approach for fuel prices forecasting. J. Netw. Comput. Appl. 233, 104048 (2025).
Jia, H., Rao, H., Wen, C. & Mirjalili, S. Crayfish optimization algorithm. Artif. Intell. Rev. 56, 1919–1979 (2023).
Fakhouri, H. N. et al. Novel hybrid crayfish optimization algorithm and self-adaptive differential evolution for solving complex optimization problems. Symmetry 16, 927 (2024).
Sorour, S. E. et al. An improved binary crayfish optimization algorithm for handling feature selection task in supervised classification. Mathematics 12, 2364 (2024).
Sait, S. M., Mehta, P., Yıldız, A. R. & Yıldız, B. S. Optimal design of structural engineering components using artificial neural network-assisted crayfish algorithm. Mater. Test. 66, 1–10 (2024).
Shikoun, N. H., Al-Eraqi, A. S. & Fathi, I. S. BinCOA: An efficient binary crayfish optimization algorithm for feature selection. IEEE Access 12, 28621–28635 (2024).
AbdElminaam, D. S. et al. Parameters extraction of the three-diode photovoltaic model using crayfish optimization algorithm. IEEE Access 12, 1–15 (2024).
Sugitha, G. & Chaluvaraj, P. B. Self-Attention conditional generative adversarial network optimised with crayfish optimization algorithm for improving cyber security in cloud computing. Comput. Secur. 140, 103773 (2024).
Chaib, L. et al. Improved crayfish optimization algorithm for parameters estimation of photovoltaic models. Energy Convers. Manag. 313, 118627 (2024).
Patel, V. V. et al. A systematic kidney tumor segmentation and classification framework using adaptive and attentive-based deep learning networks with improved crayfish optimization algorithm. IEEE Access 12, 1–15 (2024).
Chauhan, S. et al. Parallel structure of crayfish optimization with arithmetic optimization for classifying the friction behaviour of Ti-6Al-4V alloy for complex machinery applications. Knowl.-Based Syst. 286, 111389 (2024).
Yakubu, I. Z. & Murali, M. An efficient IoT-Fog-Cloud resource allocation framework based on a two-stage approach. IEEE Access 12, 1–15 (2024).
Jia, W. et al. Research on bearing remaining useful life anti-noise prediction based on fusion of color-grayscale time-frequency features. Meas. Sci. Technol. 35, 123456 (2024).
Jiang, M. et al. Research on time-series based and similarity search based methods for PV power prediction. Energy Convers. Manag. 308, 118391 (2024).
Huang, H., Gao, W. & Yang, G. Distribution transformer mechanical faults diagnosis method incorporating cross-domain feature extraction and recognition of unknown-type faults. Measurement 238, 115270 (2024).
Almutairi, S. Z. et al. A hierarchical optimization approach to maximize hosting capacity for electric vehicles and renewable energy sources through demand response and transmission expansion planning. Sci. Rep. 14, 15765 (2024).
Zhong, R., Fan, Q., Zhang, C. & Yu, J. Hybrid remora crayfish optimization for engineering and wireless sensor network coverage optimization. Clust. Comput. 27, 10141–10168 (2024).
Zhang, Y., Liu, P. & Li, Y. Implementation of an enhanced crayfish optimization algorithm. Biomimetics 9, 341 (2024).
Jia, H. et al. Modified crayfish optimization algorithm for solving multiple engineering application problems. Artif. Intell. Rev. 57, 127 (2024).
Wang, R., Zhang, S. & Zou, G. An improved multi-strategy crayfish optimization algorithm for solving numerical optimization problems. Biomimetics 9, 361 (2024).
Kouba, A., Petrusek, A. & Kozák, P. Continental-wide distribution of crayfish species in Europe: update and maps. Knowl. Manag. Aquat. Ecosyst. 413, 05 (2014).
Allan, E. L., Froneman, P. W. & Hodgson, A. N. Effects of temperature and salinity on the standard metabolic rate (SMR) of the caridean shrimp Palaemon peringueyi. J. Exp. Mar. Biol. Ecol. 337, 103–108 (2006).
Larson, E. R. & Olden, J. D. The state of crayfish in the Pacific Northwest. Fisheries 36, 60–73 (2011).
Bellman, K. L. & Krasne, F. B. Adaptive complexity of interactions between feeding and escape in crayfish. Science 221, 779–781 (1983).
Gürses, D. et al. A novel hybrid water wave optimization algorithm for solving complex constrained engineering problems. Mater. Test. 63, 560–564 (2021).
Cui, F. et al. Soil water content estimation using ground penetrating radar data via group intelligence optimization algorithms: an application in the Northern Shaanxi coal mining area. Energy Explor. Exploit. 39, 318–335 (2021).
Zhu, S. et al. A new one-dimensional compound chaotic system and its application in high-speed image encryption. Appl. Sci. 11, 11206 (2021).
Guang-Yi, W. & Fang, Y. Cascade chaos and its dynamic characteristics. Chin. Phys. B. 22, 050504 (2013).
Rahnamayan, S., Tizhoosh, H. R. & Salama, M. M. Opposition versus randomness in soft computing techniques. Appl. Soft Comput. 8, 906–918 (2008).
Tizhoosh, H. R. Opposition-based learning: a new scheme for machine intelligence. Proc. CIMCA-IAWTIC’06. 1, 695–701 (2005).
Rahnamayan, S., Tizhoosh, H. R. & Salama, M. M. Quasi-oppositional differential evolution. Proc. IEEE CEC 2229–2236 (2007).
Trojovský, P. & Dehghani, M. Subtraction-average-based optimizer: A new swarm-inspired metaheuristic algorithm for solving optimization problems. Biomimetics 8, 149 (2023).
Bayraktar, Z., Komurcu, M. & Werner, D. H. Wind Driven Optimization (WDO): A novel nature-inspired optimization algorithm and its application to electromagnetics. Proc. IEEE APS 1–4 (2010).
Shehadeh, H. A. Chernobyl disaster optimizer (CDO): a novel meta-heuristic method for global optimization. Neural Comput. Appl. 35, 10733–10749 (2023).
Ghasemi, M. et al. Optimization based on performance of lungs in body: lungs performance-based optimization (LPO). Comput. Methods Appl. Mech. Eng. 419, 116582 (2024).
Wang, X. et al. Artificial protozoa optimizer (APO): A novel bio-inspired metaheuristic algorithm for engineering optimization. Knowl.-Based Syst. 295, 111737 (2024).
Derrac, J. et al. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol. Comput. 1, 3–18 (2011).
Hashim, F. A. et al. Henry gas solubility optimization: A novel physics-based algorithm. Future Gener. Comput. Syst. 101, 646–667 (2019).
Huang, G. B., Ding, X. & Zhou, H. Optimization method based extreme learning machine for classification. Neurocomputing 74, 155–163 (2010).
Banerjee, K. S. Generalized Inverse of Matrices and its Applications (Wiley, 1973).
Huang, G. B., Zhou, H., Ding, X. & Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. B 42, 513–529 (2011).
Wang, Z., Qu, Q., Yu, G. & Kang, Y. Breast tumor detection in double views mammography based on extreme learning machine. Neural Comput. Appl. 27, 227–240 (2016).
Wang, Z. et al. Breast tumor detection in digital mammography based on extreme learning machine. Neurocomputing 128, 175–184 (2014).
Huang, G. B., Wang, D. H. & Lan, Y. Extreme learning machines: A survey. Int. J. Mach. Learn. Cybern. 2, 107–122 (2011).
Huang, G. B., Zhu, Q. Y. & Siew, C. K. Extreme learning machine: Theory and applications. Neurocomputing 70, 489–501 (2006).
Author information
Authors and Affiliations
Contributions
Y.L. conceived the research, developed the Hybrid Cumulative Strategy-Enhanced Crayfish Optimization Algorithm (MSCOA), conducted the experiments, analyzed the data, and wrote the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, YJ. Hybrid strategy enhanced crayfish optimization algorithm for breast cancer prediction. Sci Rep 15, 29146 (2025). https://doi.org/10.1038/s41598-025-11129-0