Introduction

As global economic integration continues to deepen, enterprises are increasingly exposed to complex and volatile market competition and operational risks. Among these, bankruptcy risk has emerged as one of the most critical threats to corporate survival and sustainable development1. Since the global financial crisis in 2008, the cascading effects of corporate bankruptcies have not only impacted specific industries but have also posed systemic risks to regional economic stability and the global financial system at large. For instance, the collapse of large enterprises can lead to disrupted supply chains, widespread unemployment, and a surge in non-performing bank loans. Meanwhile, the mass failure of small and medium-sized enterprises (SMEs) directly undermines regional economic vitality. Statistics indicate that globally, 3%–5% of registered companies go bankrupt each year due to poor management, with especially high bankruptcy rates in industries such as manufacturing and retail2,3. Accurate prediction of corporate bankruptcy risk is therefore essential. It provides critical decision-making support for business managers, enabling timely adjustments to strategies that help avoid crises. Furthermore, it offers early warning signals to investors, creditors, and regulatory authorities, thereby contributing to the stability and orderly operation of financial markets4.

Bankruptcy prediction fundamentally involves analyzing a company’s historical operational data—such as financial indicators and market performance—to build mathematical models that assess its likelihood of survival over a future time horizon. This constitutes a classic binary classification problem5,6. Early research in bankruptcy prediction primarily relied on traditional statistical methods, including multiple linear regression, logistic regression, and discriminant analysis. These approaches typically construct predictive models by linearly combining financial ratios—such as debt-to-asset ratio and current ratio—and offer advantages such as conceptual simplicity and strong interpretability7. However, these conventional methods generally assume specific data distributions and linear relationships among variables. Such assumptions make it difficult to capture the nonlinear and dynamic characteristics inherent in corporate operations. As a result, the predictive accuracy of traditional statistical models is often limited under complex and rapidly changing economic conditions.

With the rapid advancement of artificial intelligence technologies, a wide range of machine learning-based intelligent models have been extensively applied to bankruptcy prediction, significantly improving predictive performance. These models include Artificial Neural Networks (ANN)8, Backpropagation Neural Networks (BP)9, Support Vector Machines (SVM)10,11, Decision Trees12, Extreme Learning Machines (ELM)13 and Kernel Extreme Learning Machines (KELM)14. Owing to their powerful nonlinear fitting capabilities and adaptive learning mechanisms, these models can extract latent patterns from massive datasets and effectively handle complex data structures that traditional methods struggle with. Among them, Kernel Extreme Learning Machine (KELM), an improved variant of ELM, introduces a kernel function to map input data into a high-dimensional feature space. This enhancement not only preserves the advantages of ELM—such as fast training speed and strong generalization ability—but also significantly improves its ability to handle nonlinear problems. KELM has demonstrated outstanding performance in fields such as financial risk forecasting and medical diagnosis15,16.

Despite the theoretical and practical strengths of the KELM model, its predictive performance is highly sensitive to the selection of two key hyperparameters: the penalty parameter (C) and the kernel parameter (γ). The penalty parameter C controls the trade-off between model complexity and fitting error, while the kernel parameter γ determines how input data are mapped into a high-dimensional feature space. The choice of these parameters directly affects the model’s classification accuracy and generalization ability17. Conventional parameter optimization methods—such as grid search and random search—are often inefficient and prone to getting trapped in local optima, making it difficult to identify the globally optimal parameter configuration18. As a result, integrating efficient intelligent optimization algorithms to fine-tune the KELM parameters has become a crucial approach to enhancing its prediction performance.
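To make the roles of the two hyperparameters concrete, the following is a minimal, self-contained Python sketch of a KELM-style classifier. The regularized kernel solution \(\beta=(I/C+K)^{-1}y\) with an RBF kernel parameterized by γ is the standard KELM formulation; the class name, toy data, and parameter values here are purely illustrative (the paper's own experiments use MATLAB):

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KELM:
    """Minimal kernel extreme learning machine for +/-1 labels (sketch)."""
    def __init__(self, C=1.0, gamma=0.1):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X_train = X
        K = rbf_kernel(X, X, self.gamma)
        n = X.shape[0]
        # Output weights: beta = (I/C + K)^{-1} y.
        # C trades off fitting error vs. regularization; gamma sets kernel width.
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, y)
        return self

    def predict(self, Xq):
        return np.sign(rbf_kernel(Xq, self.X_train, self.gamma) @ self.beta)

# Toy data: label is the sign of the first feature (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.sign(X[:, 0])
model = KELM(C=100.0, gamma=0.5).fit(X, y)
acc = float((model.predict(X) == y).mean())
```

A poor (C, γ) pair degrades this fit markedly, which is exactly what motivates tuning them with an optimizer instead of grid search.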

Swarm intelligence optimization algorithms, inspired by the collective behaviors of biological populations in nature19, have been widely adopted for KELM parameter optimization due to their strong global search capabilities and simple implementation. For example, Particle Swarm Optimization (PSO)20 simulates the foraging behavior of bird flocks to locate optimal parameters; Grey Wolf Optimizer (GWO)21 is inspired by the hunting strategies of grey wolves; and Whale Optimization Algorithm (WOA)22 mimics the bubble-net hunting behavior of humpback whales to iteratively update solutions. Lu et al.23 proposed an Active Operators Particle Swarm Optimization algorithm (APSO) to obtain the optimal initial parameter set for KELM, thereby creating an optimized classifier named APSO-KELM. Li et al.24 introduced a Biogeography-Based Optimization Extreme Learning Machine (BBO-KELM) for ultra-short-term wind power forecasting in different regions. Wang et al.25 developed a new KELM parameter tuning strategy for bankruptcy prediction using Grey Wolf Optimizer. Li et al.26 proposed a hybrid model based on KELM for predicting the intermittent pumping interval of wells, where parameters were optimized using an Improved Brain Storm Optimization (IBSO-KELM) algorithm. Han27 presented a Short-Term Power Load Forecasting (STPLF) model, in which KELM was optimized via an Improved Whale Optimization Algorithm (IWOA). While these approaches have improved the efficiency of KELM parameter optimization to some extent, they still face several limitations. Some algorithms converge slowly and are prone to being trapped in local optima when solving complex optimization problems. Others exhibit poor adaptability in high-dimensional spaces, with significant declines in accuracy when the parameter search range is expanded.

According to the “No Free Lunch” theorem28, no single optimization algorithm can perform best across all problems, thereby motivating continuous research and improvement of existing methods. Therefore, the effectiveness of an optimization strategy is closely related to the intrinsic characteristics of the target problem. Bankruptcy prediction datasets typically exhibit strong nonlinearity, feature sparsity, and complex interactions among financial ratios. In such scenarios, conventional local search mechanisms are prone to premature convergence. Lévy flight is characterized by a heavy-tailed step-length distribution, which enables occasional long jumps during the search process. This property is particularly suitable for navigating highly nonlinear and sparse feature spaces, as it helps the optimizer escape local optima while still maintaining local refinement capability. In addition, bankruptcy-related parameters such as the penalty factor C and kernel parameter γ of KELM often lie near the boundary of the search space. Therefore, a hybrid boundary handling strategy is introduced to preserve gradient information near the boundaries and improve search efficiency in these critical regions. The combination of elite-guided Lévy mutation and hybrid boundary handling thus aligns well with the data characteristics of bankruptcy prediction problems, providing a problem-driven justification for the proposed enhancements rather than relying on generic algorithmic modifications.

The Zebra Optimization Algorithm (ZOA) is a novel swarm intelligence algorithm inspired by the foraging behavior and defensive strategies of zebras. It achieves global optimization by simulating the cooperative mechanisms within zebra groups29. Compared to other algorithms, ZOA features conceptual simplicity and fast convergence. However, in practical applications, it suffers from several limitations, including rapid loss of population diversity and insufficient local exploitation ability30,31. For instance, in high-dimensional parameter optimization tasks, the standard ZOA is prone to premature convergence due to its relatively simple search pattern, resulting in suboptimal tuning of KELM parameters and reduced bankruptcy prediction accuracy.

To address these issues, this paper proposes an Enhanced Archive-Based Zebra Optimization Algorithm (EAZOA) and applies it to optimize key parameters of KELM, thereby constructing the EAZOA-KELM bankruptcy prediction model. First, an elite-guided Lévy mutation strategy is introduced to construct an elite pool composed of the top three individuals with the best fitness values. By leveraging the random perturbation characteristics of Lévy flight, this strategy enhances both global exploration and local exploitation, effectively preventing premature loss of population diversity. Second, a dynamic elite archive mechanism is designed. It employs a ring-buffer structure to store historically optimal solutions. Through the collaborative operation of an elite queue and a diversity queue, the algorithm makes full use of past search experiences, significantly improving its adaptability in dynamic environments. Finally, a hybrid boundary handling technique is incorporated. This technique integrates dynamic boundary contraction, probabilistic mixed repair, and gradient-preserving strategies to address the low search efficiency of standard ZOA near the boundary regions. It ensures that the parameter search process remains both comprehensive and precise. The synergistic integration of these three innovations enables systematic enhancements in exploration–exploitation balance, dynamic adaptability, and boundary-handling efficiency, thereby significantly improving the overall performance of the optimization algorithm.

Existing enhanced ZOA variants focus on single-dimensional improvements: Chaotic ZOA30 introduces chaos into the initial population but lacks an archive mechanism; Enhanced ZOA31 optimizes for engineering problems but ignores boundary handling. In contrast, EAZOA integrates three synergistic innovations: (1) elite-guided Lévy mutation, which combines global exploration and local exploitation, unlike the standalone Lévy mutation in other algorithms; (2) a dynamic ring-buffer archive, which preserves historical solutions and addresses ZOA’s “memoryless” limitation; and (3) hybrid boundary handling, which targets financial data’s boundary sensitivity, a factor not considered in previous ZOA enhancements. This multi-faceted improvement makes EAZOA more adaptable to bankruptcy prediction’s unique data characteristics than existing ZOA variants.

The main contributions of this paper are as follows:

  1. An improved Zebra Optimization Algorithm (EAZOA) is proposed, which incorporates an elite archive mechanism and a Lévy mutation strategy. These enhancements significantly improve the algorithm’s global search capability and convergence accuracy, offering an efficient solution for complex parameter optimization problems.

  2. A bankruptcy prediction model, EAZOA-KELM, is developed by using EAZOA to optimize the penalty parameter and kernel parameter of KELM. The synergistic integration of both methods leads to improved prediction accuracy and stability.

  3. The proposed model is validated on the publicly available Wieslaw financial dataset. Comparative experiments with PSO-KELM, GWO-KELM, and ZOA-KELM demonstrate the superior performance of the proposed method in bankruptcy prediction, providing a novel and effective tool for corporate risk early warning.

The remainder of this paper is organized as follows: Section “Zebra optimization algorithm and the proposed methodology” introduces the fundamental principles of ZOA and the improvements made in EAZOA. Section “Numerical experiments” validates the effectiveness of the proposed EAZOA using the CEC2020 and CEC2022 benchmark suites. Section “Classification prediction of bankruptcy prediction problem” presents the EAZOA-KELM bankruptcy prediction model and compares it with other benchmark models. Section “Summary and prospect” concludes the study and discusses directions for future research.

Zebra optimization algorithm and the proposed methodology

Zebra optimization algorithm (ZOA)

Initialization

In the Zebra Optimization Algorithm (ZOA), the values of the decision variables are determined by the positions of individual zebras in the search space. Each zebra can be modeled as a member of the ZOA population, where each element represents the value of a specific problem variable. The entire zebra population can be mathematically formulated using a matrix representation, with the initial positions of zebras randomly distributed across the search space. Similar to other heuristic algorithms, ZOA begins by generating a set of candidate solutions at random, as defined in Eq. (1)29:

$$X={\left[ {\begin{array}{*{20}{c}} {{X_1}} \\ {{X_2}} \\ \vdots \\ {{X_i}} \\ \vdots \\ {{X_N}} \end{array}} \right]_{N \times dim}}={\left[ {\begin{array}{*{20}{c}} {{x_{1,1}}}&{{x_{1,2}}}& \cdots &{{x_{1,j}}}& \cdots &{{x_{1,dim}}} \\ {{x_{2,1}}}&{{x_{2,2}}}& \cdots &{{x_{2,j}}}& \cdots &{{x_{2,dim}}} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ {{x_{i,1}}}&{{x_{i,2}}}& \cdots &{{x_{i,j}}}& \cdots &{{x_{i,dim}}} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ {{x_{N,1}}}&{{x_{N,2}}}& \cdots &{{x_{N,j}}}& \cdots &{{x_{N,dim}}} \end{array}} \right]_{N \times dim}}$$
(1)

where X represents the entire zebra population, \({X_i}\) denotes the position vector of the \(i^{th}\) zebra, \({x_{i,j}}\) is the value of the \(j^{th}\) decision variable for the \(i^{th}\) zebra, N is the population size, and \(dim\) is the dimensionality of the problem.

The initial positions of zebras in the optimization space are generated using a random initialization strategy, as defined by Eq. (2):

$$\begin{array}{*{20}{c}} {{X_{i,j}}=\left( {u{b_j} - l{b_j}} \right) \times {r_1}+l{b_j}} \end{array}$$
(2)

where \({X_{i,j}}\) is the initial value of the \(j^{th}\) decision variable for the \(i^{th}\) candidate solution, \(u{b_j}\) and \(l{b_j}\) denote the upper and lower bounds, and \({r_1}\) is a random number uniformly distributed in the range (0, 1).
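The random initialization of Eq. (2) can be sketched as follows (an illustrative Python version; the function name and test bounds are our own):

```python
import numpy as np

def initialize_population(N, dim, lb, ub, rng):
    """Eq. (2): X[i, j] = (ub_j - lb_j) * r1 + lb_j, with r1 ~ U(0, 1)."""
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    return lb + (ub - lb) * rng.random((N, dim))

# Illustrative bounds: 30 zebras in a 2-dimensional search space.
rng = np.random.default_rng(42)
X = initialize_population(30, 2, lb=[-5.0, 0.0], ub=[5.0, 1.0], rng=rng)
```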

Exploration: foraging behavior

In the initial phase, zebras primarily feed on grasses and sedges. However, when their preferred food sources are scarce, they also consume shoots, berries, bark, roots, and fallen leaves. Depending on the quality and abundance of vegetation, zebras may spend 60% to 80% of their time grazing. One species, known as the plains zebra, acts as a grazing pioneer by consuming the upper and canopy layers of less nutritious grasses, thereby creating space for other species that require shorter and more nutritious grasses29. In ZOA, the best individual in the population is regarded as the pioneer zebra, which guides other members towards promising locations in the search space. The position update of zebras during the foraging phase can be modeled using Eqs. (3) and (4):

$$\begin{array}{*{20}{c}} {{X_{i,j}}\left( {t+1} \right)={X_{i,j}}\left( t \right)+{r_2} \times \left( {P{Z_j}\left( t \right) - I \times {X_{i,j}}\left( t \right)} \right)} \end{array}$$
(3)
$$X_{i,j}\left( {t+1} \right)=\left\{ {\begin{array}{*{20}{l}} {{X_{i,j}}\left( {t+1} \right),}&{if~f\left( {t+1} \right)<f\left( t \right)} \\ {{X_{i,j}}\left( t \right),}&{else} \end{array}} \right.$$
(4)

where \({X_{i,j}}\left( {t+1} \right)\) and \({X_{i,j}}\left( t \right)\) represent the position of the \(i^{th}\) zebra in the \(j^{th}\) dimension at iterations \(t+1\) and \(t\), respectively. The scalar \({r_2}\) is a random number uniformly distributed in the interval [0, 1], and I is a randomly selected integer from the set \(\left\{ {1,2} \right\}\). \(P{Z_j}\left( t \right)\) denotes the position of the best-performing pioneer zebra at iteration t. The fitness values of the candidate solution at iterations \(t+1\) and t are represented by \(f\left( {t+1} \right)\) and \(f\left( t \right)\), respectively.
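The foraging update of Eqs. (3) and (4) amounts to a greedy move toward the pioneer zebra. A minimal Python sketch, using the sphere function as a stand-in objective (function names and loop counts are illustrative):

```python
import numpy as np

def sphere(x):
    """Toy objective for the sketch: f(x) = sum(x_j^2)."""
    return float(np.sum(x ** 2))

def foraging_step(X, fitness, f, rng):
    """Eqs. (3)-(4): move toward the pioneer (best) zebra; keep the move
    only if it improves the objective (greedy acceptance)."""
    PZ = X[np.argmin(fitness)]              # pioneer zebra = current best
    for i in range(len(X)):
        r2 = rng.random(X.shape[1])         # r2 ~ U(0, 1) per dimension
        I = rng.integers(1, 3)              # I drawn from {1, 2}
        cand = X[i] + r2 * (PZ - I * X[i])  # Eq. (3)
        fc = f(cand)
        if fc < fitness[i]:                 # Eq. (4): greedy acceptance
            X[i], fitness[i] = cand, fc
    return X, fitness

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(20, 3))
fit = np.array([sphere(x) for x in X])
best_before = fit.min()
for _ in range(50):
    X, fit = foraging_step(X, fit, sphere, rng)
best_after = fit.min()
```

Because of the greedy acceptance in Eq. (4), the best fitness is monotonically non-increasing across iterations.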

Exploitation: defense strategies against predators

At this stage, the primary predator of zebras is the lion, but they also face threats from cheetahs, leopards, wild dogs, brown hyenas, spotted hyenas, and crocodiles. The defense strategies of zebras vary depending on the predator. Against large predators such as lions, zebras employ zigzag running and random lateral movements to escape. Against smaller predators, zebras exhibit more aggressive behavior by grouping together to confuse and intimidate the attackers. The ZOA design assumes that one of the following two scenarios occurs with equal probability29:

  1. Lions launch an attack on zebras, prompting the zebras to flee;

  2. Zebras adopt an aggressive strategy in response to attacks by other predators.

In the first scenario, when zebras are attacked by lions, they seek refuge near their current positions. This behavior is modeled by strategy \({S_1}\) in Eq. (5). In the second scenario, when other predators attack, the rest of the zebra group moves toward the targeted zebra to form a defensive formation, thereby intimidating and confusing the predators. This behavior is mathematically represented by strategy \({S_2}\) in Eq. (5). When updating a zebra’s position, the new position is accepted only if it improves the objective function value. This update condition is described in Eq. (6)29:

$$X_{i,j}\left( {t+1} \right)=\left\{ {\begin{array}{*{20}{l}} {{S_1}:{X_{i,j}}\left( t \right)+R \times \left( {2{r_3} - 1} \right) \times \left( {1 - t/T} \right) \times {X_{i,j}}\left( t \right),}&{if~{P_S}<0.5} \\ {{S_2}:{X_{i,j}}\left( t \right)+{r_3} \times \left( {A{Z_j}\left( t \right) - I \times {X_{i,j}}\left( t \right)} \right),}&{else} \end{array}} \right.$$
(5)
$$X_{i,j}\left( {t+1} \right)=\left\{ {\begin{array}{*{20}{l}} {{X_{i,j}}\left( {t+1} \right),}&{if~f\left( {t+1} \right)<f\left( t \right)} \\ {{X_{i,j}}\left( t \right),}&{else} \end{array}} \right.$$
(6)

where t is the current iteration number, T is the maximum number of iterations, R is a constant set to 0.01, and \({P_S}\) is the probability used to switch between the two strategies, sampled uniformly from the interval [0, 1]. \({r_3}\) is another random number within [0, 1], and \(A{Z_j}\left( t \right)\) denotes the position of the attacked zebra.
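The two defense scenarios of Eqs. (5) and (6) can be sketched in the same style (illustrative Python; the attacked zebra is drawn uniformly at random here, an assumption of this sketch):

```python
import numpy as np

def sphere(x):
    """Toy objective for the sketch: f(x) = sum(x_j^2)."""
    return float(np.sum(x ** 2))

def defense_step(X, fitness, f, t, T, rng, R=0.01):
    """Eqs. (5)-(6): with probability 0.5 take refuge near the current
    position (S1); otherwise move toward an attacked zebra (S2).
    Accept a new position only if it improves the objective."""
    N, dim = X.shape
    for i in range(N):
        r3 = rng.random(dim)
        if rng.random() < 0.5:                                 # S1: flee a lion
            cand = X[i] + R * (2 * r3 - 1) * (1 - t / T) * X[i]
        else:                                                  # S2: group defense
            AZ = X[rng.integers(N)]                            # attacked zebra
            I = rng.integers(1, 3)                             # I in {1, 2}
            cand = X[i] + r3 * (AZ - I * X[i])
        fc = f(cand)
        if fc < fitness[i]:                                    # Eq. (6)
            X[i], fitness[i] = cand, fc
    return X, fitness

rng = np.random.default_rng(1)
X = rng.uniform(-5, 5, size=(20, 3))
fit = np.array([sphere(x) for x in X])
before = fit.min()
for t in range(1, 51):
    X, fit = defense_step(X, fit, sphere, t, 50, rng)
after = fit.min()
```

Note the (1 − t/T) factor in S1: the flee radius shrinks as iterations progress, shifting the phase from exploration toward exploitation.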

Proposed elite archive zebra optimization algorithm (EAZOA)

The novelty of EAZOA lies in the synergistic integration of three complementary strategies, rather than isolated modifications: Elite-guided Lévy mutation ensures diversity while maintaining directionality; dynamic elite archive leverages historical experience; hybrid boundary handling optimizes for financial data’s critical thresholds. This design addresses ZOA’s core limitations (premature convergence, poor boundary search, lack of memory) simultaneously, which is not achieved by existing enhanced ZOA versions.

Elite-guided Lévy mutation strategy

The standard ZOA algorithm, during the foraging phase, approaches the global optimum via simple random perturbations. This mechanism often leads to rapid loss of population diversity and premature convergence to local optima, especially in complex multimodal function optimization31. To address this issue, this paper proposes an elite-guided Lévy mutation strategy. An elite pool consisting of the top three individuals with the best fitness is constructed. With a 10% probability, the guiding target is set as the mean position of these elite individuals; with a 90% probability, a single elite individual is randomly selected as the guiding target. On this basis, a random perturbation following the Lévy distribution is introduced. The core update formula is as follows:

$$\begin{array}{*{20}{c}} {{X_{i,j}}\left( {t+1} \right)=Elit{e_j}\left( t \right)+L\left( \beta \right) \times \left( {Elit{e_j}\left( t \right) - {X_{i,j}}\left( t \right)} \right)} \end{array}$$
(7)

where the elite target \(Elite\) is generated by Eq. (8), and the Lévy flight step \(L\left( \beta \right)\) is implemented using Mantegna’s algorithm as shown in Eq. (9):

$$Elite=\left\{ {\begin{array}{*{20}{l}} {\frac{1}{3}\mathop \sum \limits_{{k=1}}^{3} {X_k},}&{if~rand<0.1} \\ {{X_k},~k\sim U\left\{ {1,2,3} \right\},}&{otherwise} \end{array}} \right.$$
(8)
$$L\left( \beta \right)=0.5 \times \frac{{\phi \times \Gamma \left( {1+\beta } \right) \times \sin \left( {\pi \beta /2} \right)}}{{{{\left| v \right|}^{1/\beta }}}},~~\phi ,v\sim N\left( {0,1} \right)$$
(9)

where \({X_k}~\left( {k=1,2,3} \right)\) denote the global best, second best, and third best solutions, respectively.

This strategy leverages the heavy-tailed characteristic of the Lévy distribution to generate large jumps in the early stages of the algorithm, thereby enhancing global exploration. In later stages, it improves local exploitation efficiency by finely guiding the search based on elite individuals.
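Eqs. (7)–(9) can be sketched as follows (illustrative Python; the step is implemented literally as Eq. (9) is written, with a small additive guard against division by near-zero |v|, an assumption of this sketch):

```python
import math
import numpy as np

def levy_step(beta, rng):
    """Eq. (9): heavy-tailed step via phi, v ~ N(0, 1) and |v|^(-1/beta)."""
    phi, v = rng.normal(), rng.normal()
    return 0.5 * phi * math.gamma(1 + beta) * math.sin(math.pi * beta / 2) \
           / (abs(v) ** (1 / beta) + 1e-12)   # guard added for the sketch

def elite_levy_mutation(X, fitness, rng, beta=1.5):
    """Eqs. (7)-(8): guide each zebra by the elite pool (top 3 individuals);
    with 10% probability use the elite mean, otherwise a random elite."""
    order = np.argsort(fitness)
    elites = X[order[:3]]
    X_new = X.copy()
    for i in range(len(X)):
        target = elites.mean(axis=0) if rng.random() < 0.1 \
                 else elites[rng.integers(3)]
        L = levy_step(beta, rng)
        X_new[i] = target + L * (target - X[i])   # Eq. (7)
    return X_new

rng = np.random.default_rng(7)
X = rng.uniform(-5, 5, size=(10, 4))
fit = (X ** 2).sum(axis=1)
X_mut = elite_levy_mutation(X, fit, rng)
```

Because |v| occasionally falls near zero, L(β) occasionally takes very large values, which is exactly the heavy-tailed jump behavior the strategy relies on.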

Elite archive mechanism

The traditional ZOA algorithm relies solely on the current population’s best individual to guide the search. This “memoryless” mechanism limits the algorithm’s ability to leverage historical search experience, resulting in poor adaptability in dynamic environments. To overcome this limitation, this paper designs a dynamic elite archive mechanism, as illustrated in Fig. 1. The archive employs a ring-buffer structure to store historically optimal solutions. It consists of two queues: an elite queue that focuses on preserving high-quality solutions, and a diversity queue that maintains spatial distribution among solutions. Regarding the update strategy, a new solution \({X_i}\) is admitted into the archive \(\mathcal{A}\) only if it satisfies either of the following conditions: its fitness \(f\left( {{X_i}} \right)\) is better than the worst solution in the archive, or its spatial dissimilarity (measured by Hamming Distance, \(HD\)) from existing archive members exceeds half the maximum diversity \({\text{D}}\). This can be expressed as:

$$\mathcal{A} \leftarrow \mathcal{A} \cup \left\{ {{X_i}} \right\},~~if~f\left( {{X_i}} \right)<\hbox{max} \left( {f\left( \mathcal{A} \right)} \right)~or~HD\left( {{X_i}} \right)>0.5D$$
(10)

When the archive overflows, an improved tournament selection mechanism combining fitness ranking and crowding distance is employed to decide which solutions to retain. The selection score for each candidate \({X_j}\) is computed as:

$$\begin{array}{*{20}{c}} {Score\left( {{X_j}} \right)=0.7 \times \frac{{rank\left( {f\left( {{X_j}} \right)} \right)}}{{\left| \mathcal{A} \right|}}+0.3 \times \frac{{\mathop \sum \nolimits_{{k=1}}^{{\left| \mathcal{A} \right|}} \left\| {{X_j} - {X_k}} \right\|}}{{{D_{max}}}}} \end{array}$$
(11)

where \(rank\left( {f\left( {{X_j}} \right)} \right)\) is the fitness rank of \({X_j}\) within the archive \(\mathcal{A}\), \(\left| \mathcal{A} \right|\) is the archive size, \(\left\| {{X_j} - {X_k}} \right\|\) denotes the Euclidean distance between solutions \({X_j}\) and \({X_k}\), and \({D_{max}}\) is the maximum diversity distance in the archive.

The global best solution is then selected based on the updated archive.

$$X_{best}=\arg \mathop {\hbox{min} }\limits_{{X \in \mathcal{A}}} f\left( X \right)$$
(12)
Fig. 1 Schematic diagram of elite archive mechanism.
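The archive logic of Eqs. (10)–(12) can be sketched as below. This is an interpretation under stated assumptions: Euclidean distance stands in for the Hamming-distance dissimilarity, and the diversity term of Eq. (11) is given a negative sign so that both terms of the score point the same way (lower score = keep); the class name and capacity are illustrative:

```python
import numpy as np

class EliteArchive:
    """Fixed-size archive sketch: admit a solution if it beats the worst
    archived fitness or is sufficiently different (Eq. (10)); on overflow,
    drop the worst-scoring member (an interpretation of Eq. (11))."""
    def __init__(self, capacity, D):
        self.capacity, self.D = capacity, D
        self.sols, self.fits = [], []

    def _min_dist(self, x):
        return min(np.linalg.norm(x - s) for s in self.sols)

    def update(self, x, fx):
        if len(self.sols) < self.capacity or fx < max(self.fits) \
           or self._min_dist(x) > 0.5 * self.D:            # Eq. (10)
            self.sols.append(np.array(x, dtype=float))
            self.fits.append(float(fx))
        if len(self.sols) > self.capacity:
            self._truncate()

    def _truncate(self):
        A = len(self.sols)
        ranks = np.argsort(np.argsort(self.fits))          # 0 = best fitness
        dmax = max(np.linalg.norm(a - b)
                   for a in self.sols for b in self.sols) or 1.0
        # Eq. (11) (sketch): low fitness rank and high spread are both kept,
        # so the diversity term enters with a negative sign here.
        scores = [0.7 * ranks[j] / A
                  - 0.3 * sum(np.linalg.norm(self.sols[j] - s)
                              for s in self.sols) / dmax
                  for j in range(A)]
        worst = int(np.argmax(scores))
        del self.sols[worst], self.fits[worst]

    def best(self):
        return self.sols[int(np.argmin(self.fits))]        # Eq. (12)

rng = np.random.default_rng(3)
arc = EliteArchive(capacity=5, D=10.0)
for _ in range(30):
    x = rng.uniform(-5, 5, size=2)
    arc.update(x, float((x ** 2).sum()))
xb = arc.best()
```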

Hybrid boundary treatment

The standard ZOA algorithm adopts a simple boundary truncation method to handle out-of-bound individuals. This approach discards gradient information near the boundaries, resulting in low search efficiency around optimal solutions located at the boundary regions. To address this issue, this paper proposes a hybrid boundary handling technique, as illustrated in Fig. 2. Firstly, a dynamic boundary shrinking strategy gradually tightens the search range during iterations, where the shrinkage rate is related to the position of the current best solution. Next, out-of-bound individuals are repaired via a probabilistic mixed method: with 40% probability, the individual is moved toward the global best solution; with another 40%, a mirror reflection is performed; and with 20%, the individual is randomly reset within the current bounds. Additionally, for individuals near the boundary, a gradient preservation strategy is introduced to ensure search efficiency in boundary regions.

Fig. 2 Comparison of boundary treatment methods.

The mathematical formulation of the dynamic boundary shrinking mechanism is given by Eq. (13):

$$\begin{array}{*{20}{c}} {\left\{ {\begin{array}{*{20}{c}} {lb_{j}^{{\left( t \right)}}=l{b_j}+0.3{{\left( {1 - \frac{t}{T}} \right)}^{1.5}}\left( {P{Z_j} - l{b_j}} \right)~~~~~} \\ {ub_{j}^{{\left( t \right)}}=u{b_j} - 0.3{{\left( {1 - \frac{t}{T}} \right)}^{1.5}}\left( {u{b_j} - {X_{best}}_{j}} \right)} \end{array}} \right.} \end{array}$$
(13)

This nonlinear shrinking strategy allows the search range to intelligently adjust based on the current best solution’s position and iteration progress. It maintains a relatively wide search space in early stages to enhance exploration, while gradually focusing on promising regions in later stages to improve exploitation accuracy.

For out-of-bound individuals, a probabilistic three-mode repair operator is designed as follows:

$$\begin{array}{*{20}{c}} {\left\{ {\begin{array}{*{20}{c}} {P{Z_j}+N\left( {0,0.1} \right) \times \left( {ub_{j}^{{\left( t \right)}} - {X_{best}}_{j}} \right),~~\;\;Pr=0.4} \\ {2ub_{j}^{{\left( t \right)}} - {X_{ij}},~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\;\;Pr=0.4} \\ {lb_{j}^{{\left( t \right)}}+U\left( {0,1} \right) \times \left( {ub_{j}^{{\left( t \right)}} - lb_{j}^{{\left( t \right)}}} \right),~~~~~~~~~~~~\;\;~~Pr=0.2} \end{array}} \right.} \end{array}$$
(14)

This mixed approach preserves a 40% probability of guided convergence by moving toward the global best solution to exploit existing knowledge; includes a 40% mirror reflection strategy to maintain diversity near boundaries; and incorporates a 20% random reset to avoid completely losing potentially high-quality areas outside the boundaries.

Specifically, for individuals close to the boundary (i.e., \(\left| {{X_{i,j}} - bound} \right|<0.1D\)), a gradient preservation strategy is applied:

$$\begin{array}{*{20}{c}} {{X_{i,j}}\left( {t+1} \right)=bound+sign\left( {\nabla f} \right) \times 0.1D \times {{\left( {1 - t/T} \right)}^2}} \end{array}$$
(15)

This strategy extracts gradient information near the boundary to enable fine-grained search along the boundary direction, while adaptively reducing the step size with iteration progress, ultimately achieving stable convergence.

In summary, this hybrid technique, through the organic integration of dynamic boundary shrinking, probabilistic mixed repair, and gradient preservation strategies, significantly enhances the algorithm’s search efficiency in boundary regions.
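The shrinking and repair components (Eqs. (13) and (14)) can be sketched as follows; the gradient-preservation step of Eq. (15) is omitted since it requires an objective-gradient sign. A final clamp is added for the sketch so repaired points are guaranteed feasible, and all names and test values are illustrative:

```python
import numpy as np

def shrink_bounds(lb, ub, x_best, t, T, c=0.3):
    """Eq. (13): nonlinearly contract the bounds toward the current best."""
    w = c * (1 - t / T) ** 1.5
    return lb + w * (x_best - lb), ub - w * (ub - x_best)

def repair(x, lb, ub, x_best, rng):
    """Eq. (14): probabilistic three-mode repair of out-of-bound components
    (40% pull toward the best solution, 40% mirror reflection, 20% reset)."""
    x = x.copy()
    out = (x < lb) | (x > ub)
    for j in np.where(out)[0]:
        p = rng.random()
        if p < 0.4:                                   # guided convergence
            x[j] = x_best[j] + rng.normal(0, 0.1) * (ub[j] - x_best[j])
        elif p < 0.8:                                 # mirror reflection
            x[j] = 2 * ub[j] - x[j] if x[j] > ub[j] else 2 * lb[j] - x[j]
        else:                                         # random reset
            x[j] = lb[j] + rng.random() * (ub[j] - lb[j])
    return np.clip(x, lb, ub)   # safety clamp added for this sketch

rng = np.random.default_rng(5)
lb0, ub0 = np.full(3, -5.0), np.full(3, 5.0)
x_best = np.zeros(3)
lb, ub = shrink_bounds(lb0, ub0, x_best, t=10, T=100)
x = repair(np.array([7.0, -9.0, 1.0]), lb, ub, x_best, rng)
```

Early in the run (small t/T) the contraction weight w is largest in this formulation, and it decays to zero as t approaches T, so the effective box converges back to the original bounds while the search converges.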

Based on the above discussion, the pseudocode for EAZOA is presented in Algorithm 1.

Algorithm 1. Pseudo-Code of EAZOA.

Computational complexity analysis

Let \(N\) denote the population size, \(D\) the problem dimensionality, and \(T\) the maximum number of iterations. The computational complexity of the standard ZOA is \(O\left( {T \cdot N \cdot D} \right)\), mainly arising from position updates and fitness evaluations.

In EAZOA, additional computational overhead is introduced by three components: (1) elite-guided Lévy mutation, (2) elite archive maintenance, and (3) hybrid boundary handling. The Lévy mutation involves only a small subset of elite individuals and does not change the order of complexity. The elite archive employs a ring-buffer structure with a fixed size \(A \ll N\), and its update operation has a complexity of \(O\left( A \right)\). Tournament selection based on crowding distance is applied only when the archive overflows and thus incurs negligible overhead in practice.

Overall, the time complexity of EAZOA remains \(O\left( {T \cdot N \cdot D} \right)\), which is of the same order as standard ZOA. Empirical runtime comparisons indicate that the slight increase in computation is offset by faster convergence and improved solution quality.

Numerical experiments

Algorithm parameter settings

In this section, the performance of the proposed EAZOA is evaluated using the highly challenging CEC202032 and CEC202233 numerical optimization benchmark suites, and its results are compared with several well-established algorithms, including Particle Swarm Optimization (PSO)20, Grey Wolf Optimizer (GWO)21, Whale Optimization Algorithm (WOA)22, Status-Based Optimization (SBO)34, Ivy Algorithm (IVYA)35, Animated Oat Optimization (AOO)36, and the standard Zebra Optimization Algorithm (ZOA)29. The parameter settings for each algorithm are listed in Table 1.

Table 1 Algorithm parameter settings.

To ensure fairness and eliminate randomness in the experiments, all algorithms are configured with a fixed population size of 30 and a maximum of 500 iterations. Each algorithm is independently executed 30 times. The performance is evaluated in terms of the average (Ave), standard deviation (Std), and rank (Rank), with the best results highlighted in bold.

All experiments were conducted on a Windows 11 operating system with a hardware configuration of a 13th-generation Intel(R) Core(TM) i5-13400 CPU @ 2.5 GHz, 16GB RAM, using MATLAB 2024b as the simulation platform.

Experimental results

This section evaluates the overall optimization performance of the proposed EAZOA in comparison with seven well-established metaheuristic algorithms on the CEC2020 and CEC2022 benchmark suites. The detailed numerical results for the different problem dimensions are reported in Tables 2, 3, 4 and 5; the discussion here focuses on general performance trends rather than reiterating the tabulated values.

Table 2 Experimental results of CEC2020 (dim = 10).
Table 3 Experimental results of CEC2020 (dim = 20).
Table 4 Experimental results of CEC2022 (dim = 10).
Table 5 Experimental results of CEC2022 (dim = 20).

The experimental results on the CEC2020 and CEC2022 test suites (Tables 2, 3, 4 and 5) demonstrate that the proposed EAZOA algorithm exhibits significant advantages in solving high-dimensional optimization problems. On the CEC2020 test suite with a dimension of 10, EAZOA achieved an average rank (Ave. Rank) of 1.20 across 10 benchmark functions and ranked first in the Friedman test. It significantly outperformed other algorithms such as PSO (4.10), GWO (3.50), and WOA (6.60). Particularly, on functions F1, F5, and F7, EAZOA obtained average fitness values (Ave) of 2.0356E+03, 5.7127E+03, and 4.5231E+03, respectively, which are substantially lower than those of the standard ZOA (7.5318E+08, 3.5573E+04, and 2.7713E+04). Additionally, EAZOA exhibited smaller standard deviations (Std), indicating better stability. When the problem dimension increased to 20, the superiority of EAZOA became even more pronounced. It achieved an average rank of 1.30, and on function F1, the average fitness value (2.6920E+03) was approximately 5.6 × 10⁻⁷ of that obtained by the standard ZOA (4.7764E+09), further confirming the effectiveness of EAZOA in tackling complex high-dimensional problems.

On the CEC2022 test suite with a dimension of 10, EAZOA achieved an average rank of 1.42. It produced near-theoretical optimal results on several functions, such as F1, F3, and F9—for instance, on F1, it achieved an average fitness value (Ave) of 3.0000E+02 with a remarkably small standard deviation (Std) of 8.7126E-09. Moreover, EAZOA demonstrated superior performance on complex multimodal functions like F5 and F6. Specifically, on F6, its average fitness value (3.3731E+03) outperformed that of the standard ZOA (3.5024E+03), indicating a stronger ability to avoid local optima. When the dimension increased to 20, EAZOA maintained its leading position with an average rank of 1.50. On function F1, the average fitness value was 4.7445E+02, approximately 7% of that obtained by PSO (6.7492E+03). On F10, EAZOA achieved an average value of 3.0096E+03, representing a 13.3% improvement over ZOA (3.4723E+03). These results strongly demonstrate EAZOA’s effectiveness in dynamically adapting its search strategies to higher-dimensional spaces.

Overall, EAZOA’s high accuracy on unimodal functions (e.g., F2 and F3) and strong robustness on multimodal functions (e.g., F5 and F7) can be attributed to the synergy between its elite archive mechanism and hybrid boundary handling strategy. Compared with the standard ZOA and other intelligent optimization algorithms, EAZOA delivers comprehensive improvements in convergence precision, stability, and scalability, offering a more efficient solution for tackling complex optimization problems.
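As an illustration of how average ranks of the kind reported above are obtained (a minimal sketch, not the authors' MATLAB code; the fitness values below are placeholders), each algorithm is ranked per function by its mean fitness, and the ranks are then averaged across functions:

```python
def average_ranks(results):
    """Rank algorithms per benchmark function by mean fitness (lower is
    better) and average the ranks, as in a Friedman-style comparison.
    results: dict {algorithm: [mean fitness per function]}.
    Note: ties are not specially handled in this sketch."""
    names = list(results)
    n_funcs = len(next(iter(results.values())))
    totals = {name: 0.0 for name in names}
    for f in range(n_funcs):
        # Sort algorithms by mean fitness on function f (ascending).
        ordered = sorted(names, key=lambda name: results[name][f])
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return {name: totals[name] / n_funcs for name in names}

# Placeholder mean-fitness values on three hypothetical functions.
ranks = average_ranks({
    "EAZOA": [2.0e3, 5.7e3, 4.5e3],
    "ZOA":   [7.5e8, 3.5e4, 2.7e4],
    "PSO":   [1.2e5, 8.0e3, 9.1e3],
})
```

With these placeholder values, EAZOA ranks first on every function, so its average rank is 1.0.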

Additionally, the rank distribution illustrated in Fig. 3 provides a visual summary of the overall performance rankings of different algorithms on the CEC2020 (dimensions 10 and 20) and CEC2022 (dimensions 10 and 20) benchmark functions. As shown in the figure, EAZOA consistently ranks at the top across all test scenarios. Its performance significantly surpasses that of the comparison algorithms, including the standard ZOA.

Regardless of the problem dimension—whether low-dimensional (10) or high-dimensional (20)—EAZOA’s rank distribution is more concentrated and skewed toward the optimal range. This observation aligns well with the tabulated results, where EAZOA consistently secures the first place in both average ranking and Friedman ranking, further confirming its stable superiority across various dimensions and function types.

In contrast, other algorithms such as SBO and WOA tend to exhibit rankings clustered in the lower performance range, while ZOA also lags noticeably behind EAZOA. These results clearly demonstrate the effectiveness of EAZOA’s enhancements—such as the elite archive mechanism, elite-guided Lévy mutation strategy, and hybrid boundary handling—in significantly improving overall optimization performance.

Fig. 3 The ranking distribution of different algorithms on the test functions.

Convergence speed analysis

To further investigate the convergence behavior of different algorithms during the optimization process, this section presents the convergence curves of each algorithm on the CEC2020 and CEC2022 benchmark functions. The detailed convergence results are illustrated in Fig. 4. To ensure fairness, all algorithms were configured with the same parameter settings as those used in the previous experiments. The convergence curves represent the average fitness values over 30 independent runs for each algorithm.

Fig. 4 Convergence curves on a representative subset of the test functions.

The convergence curves presented in Fig. 4 further highlight the performance advantages of EAZOA during the iterative optimization process. As observed from the convergence trajectories on the CEC2020 and CEC2022 test functions, EAZOA consistently exhibits faster convergence rates and higher convergence accuracy across most benchmark functions.

For example, on the CEC2020-F1 function with 10 dimensions, EAZOA rapidly approaches the optimal solution in the early stages of iteration. Within just 50 iterations, its fitness value drops significantly below those of other algorithms, and it continues to decline steadily, ultimately achieving a much better final value compared to PSO, GWO, WOA, and the standard ZOA. On higher-dimensional problems such as CEC2020-F2 and F3 (20 dimensions), EAZOA also demonstrates a clear advantage, entering the stable convergence phase much earlier than competing algorithms. In contrast, algorithms like SBO and WOA show slower convergence and tend to get trapped in local optima.

For the CEC2022 functions, such as F1 and F6 with 10 dimensions, EAZOA’s convergence speed advantage becomes even more pronounced. In particular, on F6, its curve diverges early from the others and eventually converges to a value close to the theoretical optimum. On high-dimensional functions like CEC2022-F3 and F10 (20 dimensions), EAZOA shows superior convergence stability, maintaining a downward trend even in the later iterations. Meanwhile, the standard ZOA and several other algorithms exhibit stagnation or premature convergence.

This outstanding convergence performance can be attributed to two key mechanisms within EAZOA: the elite-guided Lévy mutation strategy and the dynamic elite archive mechanism. The former enhances global exploration through the long-tailed property of the Lévy distribution, while the latter improves local exploitation by preserving high-quality historical solutions. Together, these mechanisms enable the algorithm to rapidly identify promising regions in the search space and perform fine-grained local searches during the optimization process.
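The Lévy mutation described above can be sketched as follows, assuming Mantegna's algorithm for drawing Lévy-distributed steps (the paper's exact update rule is not given in this section, so `scale` and `elite_guided_mutation` are illustrative names):

```python
import math
import random

def levy_step(beta=1.5, rng=random):
    """One Levy-flight step via Mantegna's algorithm: the heavy (long) tail
    of the resulting distribution occasionally yields large jumps (global
    exploration), while most steps stay small (local refinement)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def elite_guided_mutation(x, elite, beta=1.5, scale=0.01, rng=random):
    """Perturb a candidate solution x around an elite solution using a
    Levy step (illustrative form; `scale` is a hypothetical step-size knob)."""
    return [xi + scale * levy_step(beta, rng) * (ei - xi)
            for xi, ei in zip(x, elite)]
```

The exponent `beta` controls tail heaviness: values near 1 produce frequent long jumps, values near 2 approach Gaussian behavior.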

Search pattern analysis

This section systematically analyzes the search behavior and optimization performance of EAZOA based on four key indicators: search history, average fitness, search trajectory, and convergence curve37, using the experimental results on the CEC2022 test functions with 20 dimensions (Fig. 5).

The comparison of average fitness curves shows that EAZOA consistently approaches the global optimum more effectively than the standard ZOA. The overall population performance improves steadily throughout the iterations, demonstrating EAZOA’s advantage in maintaining population diversity while effectively guiding the optimization direction. The search trajectory further reveals EAZOA’s dynamic search strategy—initially performing broad global exploration to identify promising regions, then transitioning into refined local exploitation. This “exploration-first, exploitation-later” pattern allows EAZOA to efficiently balance global search and local refinement.

The convergence curves further validate EAZOA’s fast convergence capability, as it consistently reaches stable optimal solutions earlier across various test functions. Visualization of the search history shows that EAZOA does not follow a fixed search path but dynamically adjusts its behavior based on the optimization process: it explores a wider solution space in the early stages and gradually concentrates on high-quality regions in later stages. This adaptability is particularly evident in function F11, where EAZOA successfully approaches the global optimum.

These results highlight the synergy between the elite-guided Lévy mutation strategy—which enhances global exploration via the long-tailed property of the Lévy distribution while leveraging elite solutions for focused local development—and the dynamic elite archive mechanism, which exploits historical best solutions to guide the search. Together, these mechanisms significantly improve the optimization efficiency and accuracy of EAZOA.
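A bounded elite archive of the kind described here can be sketched as follows (illustrative only; the paper's archive update rule, e.g. its size limit or any diversity criteria, may be more elaborate):

```python
def update_archive(archive, solution, fitness, max_size=10):
    """Insert a (fitness, solution) pair into a bounded elite archive and
    keep only the best max_size entries (lower fitness is better)."""
    archive.append((fitness, solution))
    archive.sort(key=lambda pair: pair[0])   # best solutions first
    del archive[max_size:]                   # discard overflow (worst entries)
    return archive
```

During the search, the archived solutions can then be sampled as the `elite` guides for mutation.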

Fig. 5 Convergence behavior of EAZOA and ZOA.

Runtime comparison analysis

To evaluate the computational efficiency of the proposed EAZOA, a runtime comparison between EAZOA and the standard ZOA was conducted on the CEC2020 benchmark functions with dimensionality set to 20. Both algorithms were independently executed 30 times under identical experimental settings, and the average runtime was recorded for each test function. The comparative results are illustrated in Fig. 6.

Fig. 6 Comparison of average runtime between EAZOA and ZOA.

As shown in Fig. 6, EAZOA exhibits a slightly higher average runtime than ZOA across most benchmark functions. This increase is expected, as EAZOA incorporates additional mechanisms—including the elite archive maintenance, elite-guided Lévy mutation, and hybrid boundary handling—that introduce moderate computational overhead. However, the observed runtime increase remains marginal. For most test functions (F1–F7), the average runtime difference between EAZOA and ZOA is within a few milliseconds, indicating that the additional operations do not significantly affect overall computational efficiency.

For more complex functions (e.g., F8–F10), both algorithms require longer execution times due to increased search difficulty and fitness evaluation costs. Even in these cases, the runtime of EAZOA remains comparable to that of ZOA. Notably, the relative increase in runtime does not scale disproportionately with problem complexity, demonstrating that the proposed enhancements do not compromise scalability. In some functions (e.g., F9), the runtime difference is even negligible, suggesting that faster convergence behavior can partially offset the added algorithmic complexity.
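The runtime protocol (30 independent runs, averaged per function) can be reproduced with a simple timing harness of the following form (an illustrative sketch, not the authors' MATLAB code):

```python
import time

def mean_runtime(fn, runs=30):
    """Average wall-clock runtime of fn over independent runs, mirroring
    the 30-run averaging used for the per-function runtime comparison."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        total += time.perf_counter() - start
    return total / runs
```

`time.perf_counter()` is preferred over `time.time()` here because it is monotonic and has higher resolution.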

Overall, the results indicate that EAZOA achieves a favorable trade-off between computational cost and optimization performance. Although the incorporation of elite archiving and hybrid search strategies introduces a slight increase in runtime, this cost is minimal compared to the substantial improvements in convergence accuracy, robustness, and solution quality demonstrated in sections “Experimental results”–“Search pattern analysis”. Therefore, EAZOA can be considered a computationally efficient enhancement of ZOA, suitable for practical applications where solution quality and stability are prioritized.

Classification of bankruptcy prediction problem

KELM mathematical model

Kernel Extreme Learning Machine (KELM) is an enhanced version of the traditional Extreme Learning Machine (ELM), designed to solve classification and regression problems38. It improves upon the original ELM model by introducing kernel methods, thereby extending its applicability to nonlinear problems and further enhancing its generalization performance and learning speed39,40.

The core idea of ELM is to randomly initialize the input weights and biases of the hidden-layer neurons rather than adjusting them iteratively, which yields a unique optimal solution. Given n training samples \(\left( {{x_i},{t_i}} \right)\), consider a single hidden layer feedforward neural network (SLFN) with K hidden neurons and activation function \(g\left( x \right)\). Each input \({x_i} \in {\mathbb{R}^n}\) is an \(n \times 1\) feature vector, and each target \({t_i}\) is an \(m \times 1\) output vector. The output of the SLFN is defined by the following model41,42,43:

$$\begin{array}{*{20}{c}} {{O_j}=\mathop \sum \limits_{{i=1}}^{K} {\beta _i}g\left( {{w_i} \cdot {x_j}+{b_i}} \right)~~~~~~j=1,2, \ldots ,n} \end{array}$$
(16)

where \({O_j}\) is the output corresponding to the \(j{\text{th}}\) input, \({\beta _i}\) is the output weight vector from the \(i{\text{th}}\) hidden neuron to the output layer, \({b_i}\) is the bias of the \(i{\text{th}}\) hidden neuron, \({w_i}\) is the input weight vector of the \(i{\text{th}}\) hidden neuron, and \(g\left( {{w_i} \cdot {x_j}+{b_i}} \right)\) is the activation output of the \(i{\text{th}}\) hidden neuron.

The learning objective of ELM is to minimize training error. When an SLFN can fit n training samples with zero error, the following condition holds:

$$\begin{array}{*{20}{c}} {\mathop \sum \limits_{{i=1}}^{n} \left\| {{t_i} - {o_i}} \right\|=0} \end{array}$$
(17)

That is, there exist \({\beta _i}\), \({w_i}\), and \({b_i}\) such that:

$$\begin{array}{*{20}{c}} {{t_j}={o_j}=\mathop \sum \limits_{{i=1}}^{K} {\beta _i}g\left( {{w_i} \cdot {x_j}+{b_i}} \right)~~~~~~j=1,2, \ldots ,n~} \end{array}$$
(18)
$$\begin{array}{*{20}{c}} {T=h\left( x \right)\beta =H \times \beta } \end{array}$$
(19)

where \(T = \left[ {t_{1} ,~t_{2} ,~...,~t_{n} } \right]^{T}\), \(b~ = ~\left[ {b_{1} ~,~b_{2} ,...,~b_{K} } \right]^{T}\), and \(h\left( x \right)\) is the feature mapping function that maps input data into a K-dimensional feature space. The hidden layer output matrix \(H \in {R^{n \times K}}\) is given by:

$$\begin{array}{*{20}{c}} {H=\left[ {\begin{array}{*{20}{c}} {h\left( {{x_1}} \right)} \\ \vdots \\ {h\left( {{x_n}} \right)} \end{array}} \right]={{\left[ {\begin{array}{*{20}{c}} {g\left( {{w_1} \cdot {x_1}+{b_1}} \right)}& \cdots &{g\left( {{w_K} \cdot {x_1}+{b_K}} \right)} \\ \vdots & \ddots & \vdots \\ {g\left( {{w_1} \cdot {x_n}+{b_1}} \right)}& \cdots &{g\left( {{w_K} \cdot {x_n}+{b_K}} \right)} \end{array}} \right]}_{n \times K}}} \end{array}$$
(20)

In the training process of SLFNs, the input weights \(~{w_i}\) and biases \({b_i}\) do not require adjustment and can be randomly assigned. The output weights \(\beta\) can be analytically determined using:

$$\begin{array}{*{20}{c}} {\beta ^{\prime}={H^+} \times T} \end{array}$$
(21)

where \({H^+}\) is the Moore-Penrose generalized inverse of H, computed using orthogonal projection:

$$\begin{array}{*{20}{c}} {{H^+}~=~{H^T}{{\left( {H{H^T}} \right)}^{ - 1}}} \end{array}$$
(22)

The Moore-Penrose inverse provides the minimum-norm solution among all least-squares solutions, ensuring improved learning speed and strong generalization ability.
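Eqs. (16)–(22) translate into a few lines of linear algebra. The sketch below is illustrative (not the authors' code) and assumes a sigmoid activation: it trains a basic ELM with random hidden weights and a pseudoinverse solve.

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_train(X, T, K=50):
    """Basic ELM training (Eqs. 16-22): random input weights W and biases b,
    sigmoid activation, output weights beta from the Moore-Penrose
    pseudoinverse of the hidden-layer output matrix H."""
    n, d = X.shape
    W = rng.standard_normal((K, d))            # random input weights w_i
    b = rng.standard_normal(K)                 # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))   # H in R^{n x K}  (Eq. 20)
    beta = np.linalg.pinv(H) @ T               # beta = H^+ T    (Eq. 21)
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta                            # Eq. (19): O = H beta
```

Because W and b are never adjusted, the only trained quantity is beta, which is what makes ELM training non-iterative and fast.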

To further enhance generalization performance, Huang et al.44 proposed the kernel-based ELM (KELM), which improves upon the least-squares-based ELM. The improvement adds a regularization term \(\frac{I}{C}\) to the diagonal of \(H{H^T}\), yielding the output weights:

$$\begin{array}{*{20}{c}} {\beta ={H^T}{{\left( {\frac{I}{C}+H{H^T}} \right)}^{ - 1}}T} \end{array}$$
(23)

where \(C\) is the regularization (penalty) parameter and I is the identity matrix. The corresponding output function becomes:

$$\begin{array}{*{20}{c}} {F\left( x \right)=h\left( x \right){H^T}{{\left( {\frac{I}{C}+H{H^T}} \right)}^{ - 1}}T} \end{array}$$
(24)

The kernel matrix \({{\boldsymbol{\Omega}}_{ELM}}\) is constructed as:

$$\begin{array}{*{20}{c}} {{{\boldsymbol{\Omega}}_{ELM}}=H{H^T}:{\text{~}}{{\boldsymbol{\Omega}}_{EL{M_{i,j}}}}=h{{\left( {{x_i}} \right)}^T}h\left( {{x_j}} \right)=K\left( {{x_i},{x_j}} \right)} \end{array}$$
(25)

where \(K\left( {{x_i},{x_j}} \right)\) is the kernel function. Thus, the final output function of KELM can be expressed as:

$$\begin{array}{*{20}{c}} {F\left( x \right)={{\left[ {\begin{array}{*{20}{c}} {K\left( {x,{x_1}} \right)} \\ \vdots \\ {K\left( {x,{x_n}} \right)} \end{array}} \right]}^T}{{\left( {\frac{I}{C}+{{\boldsymbol{\Omega}}_{ELM}}} \right)}^{ - 1}}T} \end{array}$$
(26)

KELM, as a kernel-based implementation of ELM, brings enhanced stability and generalization performance to the traditional ELM framework through its kernel representation. The schematic architecture of the KELM model is briefly illustrated in Fig. 7, where the kernel function replaces the conventional feature mapping function, enabling the transformation from the input space to the feature space. This means that the network output no longer depends on an explicit hidden-layer feature mapping; instead, it is directly determined by the kernel function. Moreover, the dimension of the feature space or the hidden-layer representation is no longer predefined45.

Fig. 7 The schematic architecture of the KELM model.

In this study, the Gaussian radial basis function (RBF) is employed as the kernel function for KELM, defined as follows:

$$\begin{array}{*{20}{c}} {K\left( {u,v} \right)=\exp \left( { - \gamma {{\left\| {u - v} \right\|}^2}} \right)} \end{array}$$
(27)

The penalty parameter C and the kernel parameter \(\gamma\) play critical roles in model construction. The penalty parameter C controls the trade-off between minimizing the fitting error and model complexity. The kernel parameter \(\gamma\) defines the nonlinear mapping from the input space to the specific high-dimensional feature space of the hidden layer. Typically, to enhance the performance of KELM, these two key parameters are optimally selected using appropriate optimization algorithms45.
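Combining Eqs. (25)–(27), a minimal KELM with the Gaussian RBF kernel reduces to one kernel evaluation and one linear solve. The sketch below is illustrative (not the authors' implementation); `C` and `gamma` are the two parameters that EAZOA tunes.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel, Eq. (27): K(u, v) = exp(-gamma * ||u - v||^2)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def kelm_train(X, T, C=1.0, gamma=0.5):
    """Solve (I/C + Omega) alpha = T with Omega = K(X, X)  (Eqs. 25-26)."""
    omega = rbf_kernel(X, X, gamma)
    n = X.shape[0]
    return np.linalg.solve(np.eye(n) / C + omega, T)

def kelm_predict(X_new, X_train, alpha, gamma=0.5):
    """F(x) = [K(x, x_1), ..., K(x, x_n)] (I/C + Omega)^{-1} T  (Eq. 26)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

A larger C weakens the regularization (tighter fit to the training data), while gamma sets the width of the Gaussian kernel; their joint effect on validation error is exactly what the outer optimizer searches over.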

EAZOA-KELM

In KELM, the values of the penalty factor C and the kernel parameter \(\gamma\) have a significant impact on classification performance. To enhance the model’s applicability to the target problem, it is essential to identify an appropriate combination of these parameters. In this study, the EAZOA algorithm is employed to optimize the penalty factor C and kernel parameter \(\gamma\) of the KELM model. The parameter search ranges are listed in Table 6.

The proposed model consists of two main components. The first component is responsible for optimizing parameters within the inner loop, while the second evaluates classification performance in the outer loop. During the parameter optimization process, EAZOA dynamically adjusts the KELM parameters. The optimized parameters are then incorporated into the KELM classifier to perform the classification task. The fitness function is designed based on classification error, and is defined as follows:

$$\begin{array}{*{20}{c}} {Fitness=Error=\frac{1}{K}\mathop \sum \limits_{{i=1}}^{K} \frac{{Misclassified~Sample{s_i}}}{{Total~Sample{s_i}}}} \end{array}$$
(28)

where K represents the number of folds in 10-fold cross-validation (i.e., \(K=10\)), \(Misclassified~Sample{s_i}\) denotes the number of misclassified samples in the \(i{\text{th}}\) validation fold, and \(Total~Sample{s_i}\) is the total number of samples in that fold.
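Eq. (28) amounts to averaging the per-fold misclassification rates, e.g.:

```python
def cv_fitness(fold_errors):
    """Eq. (28): mean misclassification rate over the K validation folds.
    fold_errors: list of (misclassified, total) pairs, one per fold."""
    return sum(mis / tot for mis, tot in fold_errors) / len(fold_errors)
```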

Table 6 Parameter range.

The overall framework of the KELM classification prediction model optimized by EAZOA is illustrated in Fig. 8. To ensure the reliability and unbiasedness of the case study results, 10-fold cross-validation (CV) was employed to evaluate the performance of the classifier, while an internal 5-fold CV was used to optimize the two key parameters of the classifier. In each experiment, 7 of the 10 subsets were selected as training data, while the remaining subsets served as test data; additionally, the fifth subset extracted from the specified samples was used as the validation dataset. This experimental design provides an unbiased estimate of generalization accuracy, thereby enhancing the credibility of the results.

It is worth noting that a stratified sampling strategy was adopted to divide the dataset, ensuring that the proportion of non-bankrupt and bankrupt companies in each fold was consistent. Because random sampling introduces variability, a single run of 10-fold CV may not yield sufficiently stable classification accuracy. Therefore, each method was subjected to 20 independent runs of 10-fold CV, and the average results from these 20 runs were used as the final evaluation outcomes (Fig. 8).
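The stratified fold assignment described above can be sketched as follows (an illustrative implementation, not the authors' code): each class is shuffled and dealt round-robin across the k folds, so class proportions are preserved per fold.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds while preserving class
    proportions: shuffle each class, then deal its members round-robin."""
    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            fold_of[i] = j % k
    return fold_of
```

For the Wieslaw class sizes (128 non-bankrupt, 112 bankrupt), every fold then contains 12–13 non-bankrupt and 11–12 bankrupt firms.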

Fig. 8 Corporate bankruptcy prediction process based on EAZOA-KELM.

The present study utilizes a financial dataset—the Wieslaw dataset—to evaluate the effectiveness of the proposed EAZOA-KELM model in addressing real-world classification problems. Such problems are well known in the existing literature, making them suitable for validating the potential of optimization-based approaches. The Wieslaw dataset46 comprises 30 financial ratios and 240 samples, collected from 112 bankrupt Polish companies and 128 non-bankrupt companies. The data span a period of five years and include information from approximately three years prior to the occurrence of bankruptcy.

Classification of bankruptcy prediction

Model evaluation index

The Wieslaw dataset exhibits mild class imbalance (112 bankrupt vs. 128 non-bankrupt firms). To mitigate potential bias, a stratified sampling strategy was adopted during cross-validation to ensure consistent class distribution across folds. No additional resampling techniques were applied to avoid information leakage. Prior to model training, all features were normalized using min–max scaling. The experiments were implemented in MATLAB 2024b using built-in matrix operations and custom optimization scripts.

Although AUC is an important metric for imbalanced classification, this study primarily focuses on Accuracy, Precision, Recall, and F1-Score for consistency with related literature. Incorporating AUC analysis will be considered in future work.

The most commonly used metrics for evaluating classification performance are employed to assess the effectiveness of the proposed method: Accuracy, Precision, Recall, and F1-Score. The definitions of these four metrics are as follows47,48,49:

$$\begin{array}{*{20}{c}} {Accuracy=\frac{{TP+TN}}{{TP+FN+FP+TN}}~;} \end{array}$$
(29)
$$\begin{array}{*{20}{c}} {Precision=\frac{{TP}}{{TP+FP}}~;} \end{array}$$
(30)
$$\begin{array}{*{20}{c}} {Recall=\frac{{TP}}{{TP+FN}}~;} \end{array}$$
(31)
$$\begin{array}{*{20}{c}} {F1{\text{-}}Score=2 \times \frac{{Precision \times Recall}}{{Precision+Recall}}~;} \end{array}$$
(32)

where a true positive (TP) is a test sample of category x predicted as category x, a false positive (FP) is a sample of any other category predicted as category x, a false negative (FN) is a sample of category x predicted as any other category, and a true negative (TN) is a sample of any other category not predicted as category x.
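Eqs. (29)–(32) follow directly from the four confusion-matrix counts:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, Precision, Recall and F1-Score per Eqs. (29)-(32)."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```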

Classification experiment results and analysis

Table 7 presents the classification performance of different methods in terms of Accuracy, Precision, Recall, and F1-Score. As core evaluation metrics, these indicators reflect the classification performance of the models from multiple perspectives. Figure 9 illustrates the convergence curves of the fitness function during the iterative optimization process of KELM by different algorithms. Figure 10 compares the classification results of KELM optimized by various algorithms on the Wieslaw dataset using boxplots.

Table 7 Comparison of classification results of KELM models optimized by algorithm.
Fig. 9 Fitness curves of KELM optimized by the different algorithms.

Fig. 10 Boxplots of the KELM evaluation metrics under different optimization algorithms on the Wieslaw data.

The classification results on the Wieslaw dataset (Table 7) demonstrate that the KELM model optimized by EAZOA (EAZOA-KELM) achieved the best performance across all evaluation metrics. Specifically, it reached an Accuracy of 76.32%, Precision of 74.34%, Recall of 75.55%, and an F1-Score of 77.13%, significantly outperforming the other benchmark models. Compared to the standard ZOA-optimized KELM (ZOA-KELM), EAZOA-KELM improved Accuracy by 0.83 percentage points and F1-Score by 1.69 percentage points. These gains indicate that the incorporation of the elite archiving mechanism and hybrid boundary handling strategy not only enhanced the optimization capability of the algorithm but also indirectly boosted the overall classification performance of the model.

The convergence curves of the fitness function (Fig. 9) further reveal the advantages of EAZOA in parameter optimization. Within the early iterations (approximately the first 10), EAZOA rapidly reduced the classification error to below 0.125, while other algorithms such as ZOA and PSO remained at significantly higher error levels. As iterations progressed, EAZOA consistently maintained the lowest error region, confirming its ability to quickly identify optimal parameter combinations through the elite-guided Lévy mutation strategy.

Figure 10, presented in the form of boxplots, visually compares the performance distribution of the KELM models optimized by different algorithms across multiple experimental runs. The EAZOA-KELM model not only displays the highest performance across all metrics but also shows shorter box lengths, indicating lower variance and greater stability during the 20 repetitions of 10-fold cross-validation. This stability stems from the dynamic elite archiving mechanism, which effectively leverages historical best solutions and reduces the randomness of the parameter optimization process.

Overall, the EAZOA-KELM model exhibits high accuracy and strong robustness in the bankruptcy prediction classification task, validating the practical effectiveness and superiority of the proposed improvements.

The EAZOA-KELM model provides actionable value for financial risk management: (1) For enterprises, it enables early bankruptcy warning by analyzing routine financial data, supporting proactive strategy adjustments (e.g., debt restructuring, cost reduction); (2) For investors and creditors, it offers a data-driven tool to assess investment/credit risks, reducing non-performing assets; (3) For regulatory authorities, it aids in monitoring regional economic stability by identifying high-risk enterprises. The model’s fast training speed (≈ 1.87s per 10-fold CV) and high accuracy (76.32%) make it suitable for real-time risk assessment systems.

Summary and prospect

This study proposed a corporate bankruptcy prediction framework that integrates an Elite Archive-based Zebra Optimization Algorithm (EAZOA) with a Kernel Extreme Learning Machine (KELM). To address the premature convergence and limited exploitation capability of the standard Zebra Optimization Algorithm, three complementary enhancement strategies were introduced: (1) an elite-guided Lévy mutation strategy to improve the balance between global exploration and local exploitation, (2) a dynamic elite archive mechanism to retain high-quality historical solutions and enhance algorithm adaptability, and (3) a hybrid boundary handling technique to preserve search efficiency in boundary-sensitive regions. Extensive numerical experiments on the CEC2020 and CEC2022 benchmark suites demonstrated that EAZOA consistently outperforms several state-of-the-art metaheuristic algorithms, including PSO and GWO, in terms of convergence accuracy, robustness, and stability across unimodal, multimodal, and high-dimensional optimization problems.

When applied to the parameter optimization of KELM, the resulting EAZOA-KELM model achieved superior classification performance on the Wieslaw bankruptcy dataset. Specifically, the proposed model obtained an accuracy of 76.32%, a precision of 74.34%, a recall of 75.55%, and an F1-score of 77.13%, outperforming competing optimization-based KELM models such as PSO-KELM and GWO-KELM. These results confirm that the enhanced optimization capability of EAZOA can effectively translate into improved predictive performance, highlighting the practical potential of the proposed approach as an intelligent decision-support tool for corporate financial risk early warning.

Despite these encouraging results, several limitations of the present study should be acknowledged. First, the bankruptcy prediction experiments were conducted using only a single public dataset (Wieslaw), which is relatively small and country-specific. Although this dataset is widely used in the literature, relying on a single dataset may limit the generalizability of the conclusions. Future studies should validate the proposed model on larger and more diverse bankruptcy datasets from different countries, industries, and economic environments to further assess its robustness and practical applicability. Second, the current study focuses primarily on optimizing the penalty and kernel parameters of KELM, while the feature set remains fixed. Redundant or weakly informative financial indicators may still affect model performance, suggesting that joint optimization of feature selection and classifier parameters could further improve prediction accuracy.

In addition, while multiple evaluation metrics were employed, including accuracy, precision, recall, and F1-score, the model was evaluated under a static experimental setting. Real-world bankruptcy prediction often involves evolving financial conditions and temporal dependencies, which are not explicitly modeled in this study. Moreover, the computational cost introduced by the elite archive and hybrid boundary handling mechanisms, although moderate, may become more significant for very large-scale datasets.

Future research will therefore focus on several promising directions. At the algorithmic level, adaptive parameter control mechanisms could be incorporated to dynamically adjust the Lévy mutation probability and archive update thresholds during the optimization process. Multi-objective optimization strategies may also be explored to simultaneously optimize classification performance, model complexity, and feature relevance. At the application level, extending the proposed framework to other financial risk prediction tasks—such as credit default assessment and market risk forecasting—would further demonstrate its versatility. Finally, integrating the proposed approach with deep learning models (e.g., LSTM for time-series financial data) and privacy-preserving paradigms such as federated learning could significantly enhance its applicability in real-world, data-sensitive financial environments.