Introduction

The vehicle routing problem (VRP) is a classic topic in combinatorial optimization that commands significant attention in both academic and practical engineering domains. Among its variants, the capacitated vehicle routing problem (CVRP), which schedules transportation subject to vehicle capacity, is particularly critical. It is an NP-hard problem that directly impacts the cost and efficiency of goods transportation.

Traditional solutions to the capacitated vehicle routing problem mainly include heuristic search algorithms (such as genetic algorithm, simulated annealing, etc.), integer programming algorithms (such as branch and bound, cutting-plane methods, etc.), and their approximation algorithms. These algorithms can provide relatively accurate solutions, but due to their high complexity and slow convergence speed, they often fail to adapt to real-world application scenarios.

In 1959, Dantzig1 first introduced the concept of VRP to find the optimal transportation cost of delivering gasoline to refueling stations. VRP seeks to identify the lowest-cost routes for a vehicle fleet travelling from a central warehouse to many geographically scattered customers while adhering to capacity restrictions2. Presently, conventional approaches to CVRP encompass techniques such as genetic algorithm3, ant colony optimization4,5, variable neighborhood search6, simulated annealing7,8, tabu search9,10, and GRASP11, among others.

Cai et al.5 proposed the dynamic space reduction ant colony optimization (DSRACO) to address the capacitated vehicle routing problem. This algorithm introduces an elite reinforcement mechanism and a large-scale neighborhood search approach. Gao et al.12 proposed a hybrid ant colony optimization based on fireworks algorithm (FWA), incorporating elite ant strategy and integrating the max-min ant system (MMAS) to enhance the attraction of local optimal paths for ants. This approach was similarly applied to address CVRP.

Souza et al.13 introduced a new heuristic algorithm based on the differential evolution algorithm with local search (CDELS) to solve CVRP, which combines local search processes such as exchange operations. Teoh et al.14 devised an improved differential evolution algorithm based on local search (DELS). Their approach encompasses three distinct local search techniques: point swapping, point dropping, and flipping. A hybrid variable neighborhood genetic algorithm for solving the post office delivery problem was developed by Sbai et al.15 It is a typical application of CVRP. Faiz et al.16 proposed a novel perturbation-based variable neighborhood search method. They combined it with an adaptive selection mechanism, referred to as PVNS-ASM. Their method was tested on 21 benchmark instances. Machado et al.17 proposed a hybrid mathematical algorithm for solving CVRP, which incorporates greedy random adaptive search, mathematical modeling, and variable neighborhood search in a two-stage approach.

Pelletier et al.18 introduced and applied a two-stage metaheuristic for solving CVRP. Rezaei et al.19 proposed a multi-population imperialist competitive algorithm with genetic local search (ICAHGS) for solving CVRP. A branch-and-bound technique for CVRP was presented by Rezaei et al.19, revealing that logistical costs can be curtailed by an estimated 5-20% through optimization. Ilhan20 introduced a simulated annealing algorithm with an improved crossover operator to tackle the vehicle routing problem with time windows (VRPTW). Laporte and Semet21 presented a series of crossover improvement strategies. Xiao et al.22 proposed the variable neighborhood simulated annealing (VNSA) algorithm, which is a combination of variable neighborhood search and simulated annealing, and tested it on benchmark instances. Akpinar23 unveiled a hybrid large neighborhood search (LNS) algorithm integrating ant colony optimization’s construction mechanism, termed LNS-ACO, yielding encouraging outcomes. An innovative heuristic algorithm, grounded in tabu search and adaptive large neighborhood search (ALNS), was developed by Kır et al.24 Additionally, a hybrid bat algorithm augmented with path relinking (HBA-PR) was crafted by Zhou et al.25, enriching the optimization framework.

Hosseinabadi et al.26 proposed a novel algorithm called CVRP_GELS, which employs gravitational emulation local search techniques to address CVRP. Similarly, Altabeeb et al.27 introduced a new hybrid firefly algorithm that integrates partial mapping crossover with dual mutation operators and combines the 2h-opt and 2-opt algorithms. The algorithm was tested on benchmark instances and yielded promising results.

Qiao et al.28 proposed a modified particle swarm optimization (MPSO) for a vehicle routing problem with soft time windows. They introduced an improved adaptive strategy that combines the subtraction function and ladder strategy to adjust the inertia weight, and added a jump-out mechanism to escape local optima. Guo et al.29 explored a particle swarm optimization promotion strategy suitable for concrete transport vehicles. Ai and Kachitvichyanukul30 utilized Solution Representation-1 (SR-1) and Solution Representation-2 (SR-2) encoding, employing a particle swarm algorithm to solve CVRP. Tang et al.31 proposed a discrete artificial ecosystem-based optimization (DAEO) for the spherical capacitated vehicle routing problem, incorporating the 2-opt algorithm to simulate the mutation mechanism of organisms within the ecosystem.

In this paper, we introduce a discrete version of the wild horse optimizer (WHO), called the discrete wild horse optimizer (DWHO), specifically designed for solving CVRP. The wild horse optimizer is a novel metaheuristic algorithm conceived by Iraj Naruei and Farshid Keynia to tackle optimization challenges in continuous domains32. Although it was originally aimed at continuous problems, the algorithm has found applications in diverse research areas. Specifically, Ali et al.33 employed WHO to optimize distributed energy resources within radial distribution networks.

Furthermore, Ali et al.34 harnessed WHO for the management of frequency regulation in hybrid multi-area power systems. In the domain of energy consumption prediction for residential buildings, Vasanthkumar et al.35 leveraged WHO’s capabilities. Moreover, Vasanthkumar et al.36 explored its potential in dynamic energy management for hybrid electric vehicle batteries. To the best of our knowledge, no existing literature has adapted WHO to discrete problems, and in particular to CVRP.

This paper is organized as follows. Section “Mathematical modeling of capacitated vehicle routing problem” provides a comprehensive mathematical representation of CVRP. In section “Wild horse optimizer (WHO)”, we elucidate the core principles of WHO. Section “The proposed discrete wild horse optimizer(DWHO)” introduces DWHO combining decoding techniques and local search strategies. In section “Results and discussion”, we present an experimental evaluation of DWHO and engage in a detailed analysis and discussion of the results. We conclude with a summary of our findings and directions for future research in section “Conclusion”.

Mathematical modeling of capacitated vehicle routing problem

The mathematical model for the capacitated vehicle routing problem is as follows20. It is represented by a graph G = (N, E), with N denoting the nodes {0, ..., n} and E denoting the edges {(i,j); \(\textit{i,j} \in N\)}. The depot is symbolized by node 0, while customers are denoted by \(N\setminus \{0\}\). Each customer \(i\in N'=N-\left\{ 0 \right\}\) has a demand \(q_{i}\), where i = 1, 2, ..., n. Each edge has a cost \(TD_{ij}\), representing the travel distance between nodes i and j. All vehicles in the set V share a uniform capacity limit Q.

The objective function:

$$\begin{aligned} Minimize \sum _{i=0}^{N}\sum _{j=0}^{N} \sum _{k=1}^{K}TD_{ij} X_{ijk} \end{aligned}$$
(1)

Subject to:

$$\begin{aligned} & \sum _{k=1}^{K} \sum _{i=0}^{N} X_{ijk} =1;\forall { j}\in \left\{ 1,2,\dots , N \right\} ; i\ne j \end{aligned}$$
(2)
$$\begin{aligned} & \sum _{k=1}^{K} \sum _{j=0}^{N} X_{ijk} =1;\forall { i}\in \left\{ 1,2,\dots , N \right\} ; i\ne j \end{aligned}$$
(3)
$$\begin{aligned} & \sum _{i=1}^{N} \sum _{j=0}^{N} q_{i} X_{ijk}\le Q,\forall {k}\in V \end{aligned}$$
(4)
$$\begin{aligned} & \sum _{j=0}^{N} X_{0jk}=1,\forall {k}\in V \end{aligned}$$
(5)
$$\begin{aligned} & \sum _{i=0}^{N} X_{i0k}=1,\forall {k}\in V \end{aligned}$$
(6)
$$\begin{aligned} & \sum _{i=0}^{N}X_{ihk} - \sum _{j=0}^{N}X_{hjk}=0,\forall {h}\in N,{k}\in V \end{aligned}$$
(7)
$$\begin{aligned} & X_{ijk}\in \left\{ 0,1 \right\} , i\ne j,\forall {i,j}\in N,{k}\in V \end{aligned}$$
(8)

The decision variable:

$$\begin{aligned} X_{ijk} = {\left\{ \begin{array}{ll} 1, \hspace{5.0pt}\text{if vehicle } k \text{ travels from } i \text{ to } j \\ 0,\hspace{5.0pt}\text{otherwise} \end{array}\right. } \end{aligned}$$
(9)

Here, equation (1) is the objective function, which minimizes the cumulative distance covered by the vehicles. Equations (2) through (8) are the constraints, detailed as follows: (i) Equations (2) and (3) guarantee that each customer is serviced by exactly one vehicle. (ii) Equation (4) mandates that the aggregate demand on any route remains within the vehicle’s capacity. (iii) Equations (5) and (6) require that every vehicle route starts and ends at the central depot and that each vehicle is deployed only once. (iv) Equation (7) requires that the number of vehicles entering a node equals the number departing from it. (v) Equation (8) stipulates that each variable takes only the value 0 or 1. Equation (9) defines the binary decision variable.
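As a minimal sketch of how the objective in equation (1) and the capacity requirement can be evaluated for a candidate solution, the Python snippet below scores a list of vehicle routes. The function names, the small distance matrix, and the demand values are illustrative assumptions rather than benchmark data, and treating a load equal to Q as feasible is also an assumption.

```python
def route_distance(route, dist):
    """Travel distance of one vehicle route: depot 0 -> customers -> depot 0."""
    path = [0] + list(route) + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def total_distance(routes, dist):
    """Objective of equation (1): summed distance over all vehicle routes."""
    return sum(route_distance(r, dist) for r in routes)


def is_feasible(routes, demand, capacity, n_customers):
    """Check that every customer is visited once and no route exceeds capacity."""
    visited = sorted(c for r in routes for c in r)
    each_once = visited == list(range(1, n_customers + 1))
    within_capacity = all(sum(demand[c] for c in r) <= capacity for r in routes)
    return each_once and within_capacity


# Hypothetical 4-customer instance (node 0 is the depot).
dist = [
    [0, 2, 3, 4, 5],
    [2, 0, 1, 6, 7],
    [3, 1, 0, 2, 8],
    [4, 6, 2, 0, 1],
    [5, 7, 8, 1, 0],
]
demand = {1: 3, 2: 4, 3: 2, 4: 5}
routes = [[1, 2], [3, 4]]

print(total_distance(routes, dist))       # total travel distance of both routes
print(is_feasible(routes, demand, 7, 4))  # capacity Q = 7 (assumed)
```

Storing each route without its leading and trailing depot keeps the single-visit check simple; the depot legs are added only when distances are summed.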

Wild horse optimizer (WHO)

The wild horse optimizer is inspired by the social behaviors observed in wild horse populations. A typical wild horse herd comprises stable familial units consisting of a stallion (male horse), one or more mares (female horses), and their respective offspring. The stallion assumes a leadership role, guiding the herd in pursuit of suitable habitats. WHO draws on behaviors such as grazing, mating, dominance, and leadership to form an optimization technique tailored to problems in continuous domains. The foundational principles of WHO encompass the following five components:

  1. (a)

    Population initialization. Forming horse herd groups, and selection of leaders. An initial population is randomly generated. Subsequently, this population is segmented into distinct groups. Each group consists of a leading stallion, accompanied by one or more mares and their offspring.

  2. (b)

    Grazing and mating behavior. Foals typically spend most of their time grazing around the herd. In simulating this behavior, the stallion is considered the center of the grazing area, and group members move and search around the leader with different radii, mimicking the grazing behavior of wild horses. For mating, a horse from group i departs to join a provisional group, and simultaneously a horse from group j does the same. Assuming these two horses are male and female and have no familial connection, mating is possible. The resulting offspring must leave the provisional group and join a different group, for instance, group k. This sequence captures the natural mating and reproduction patterns of horses, and horses from all the different groups repeat this cycle of departure, mating, and reproduction. Foals’ grazing patterns are quantified by equation (10).

    $$\begin{aligned} X_{i,G}^{j}=2 Ycos(2\pi RY)*(Sta^{j}-X_{i,G}^{j})+Sta^{j} \end{aligned}$$
    (10)

    Sta denotes the position of the leader (stallion), and R is a random number chosen from the range [\(-2\), 2] that primarily regulates the angle between individuals and the leader. Y is the adaptive mechanism formulated by equations (11)–(14). WHO first requires defining parameters such as the population size (popN), stallion ratio (PS), and mating probability (PC). The positions of stallions and foals are denoted as \(X_{i} =(X_{i1},X_{i2},\cdots ,X_{in})\), where \(i \in \{1, 2,\ldots , popN\}\). Here, \(X_{i}\) represents the i-th individual, popN is a positive integer giving the population size, and n is the number of customers. The number of stallions, NS, is calculated as popN * PS, and the number of foals, Nf, is calculated as popN * (1 - PS). The path taken by the i-th horse through the customers is represented as \(X_{i1} \rightarrow X_{i2}\rightarrow \cdots \rightarrow X_{in} \rightarrow X_{i1}\). The adaptive mechanism is computed as follows:

    $$\begin{aligned} & L=\overrightarrow{R_{1} } <ADP \end{aligned}$$
    (11)
    $$\begin{aligned} & IDX=(L==0) \end{aligned}$$
    (12)
    $$\begin{aligned} & Y=R_{2} \Theta IDX+\overrightarrow{R}_{3} \Theta (\sim IDX) \end{aligned}$$
    (13)
    $$\begin{aligned} & ADP=1-\frac{iter}{maxiter} \end{aligned}$$
    (14)

    Here, L is a binary vector composed of 0 and 1, and \(\overrightarrow{R_{1}}\) and \(\overrightarrow{R_{3}}\) are random vectors uniformly distributed within the range [0, 1]. \({R_{2}}\) is a random number from the range [0, 1]. IDX marks the entries of the random vector \(\overrightarrow{R_{1}}\) for which the condition L == 0 holds. \(\Theta\) denotes the element-wise product. ADP is a coefficient that decreases linearly from 1 to 0 over the run, iter is the current iteration number, and maxiter is the maximum number of iterations.

  3. (c)

    Group leadership. The leader (stallion) guides the group members to move towards more propitious habitats. If the current group dominates the area, they will utilize that region; conversely, if another group dominates the area, they must leave that location. The habitats symbolize the current optimal solutions.

  4. (d)

    Exchange and selection of leaders. Leadership is determined by fitness values. If there are group members within a group whose fitness is better than that of the current leader, a position swap occurs between the leader and the corresponding group member. The position of the new group leader represents the optimal solution for that group.

  5. (e)

    Saving the optimal solution. Upon comparing the fitness values across all group leaders, the leader possessing the paramount fitness embodies the global optimal solution. This solution is retained for subsequent iterations.
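The grazing update of component (b) can be sketched in a few lines of Python. This is a hedged illustration of equations (10) to (14) under stated assumptions, not the authors' implementation: the function names are hypothetical, positions are plain lists of floats, R is drawn once per update as a scalar, and \(\Theta\) is read as an element-wise selection between \(R_{2}\) and \(\overrightarrow{R_{3}}\).

```python
import math
import random


def adaptive_y(dim, iteration, max_iter, rng):
    """Equations (11)-(14): build the adaptive vector Y for one update."""
    adp = 1 - iteration / max_iter              # equation (14)
    r1 = [rng.random() for _ in range(dim)]     # random vector R1
    r2 = rng.random()                           # random scalar R2
    r3 = [rng.random() for _ in range(dim)]     # random vector R3
    l_vec = [x < adp for x in r1]               # equation (11)
    idx = [not flag for flag in l_vec]          # equation (12): L == 0
    # Equation (13): take R2 where IDX is true and R3 elsewhere.
    return [r2 if use_r2 else r3[d] for d, use_r2 in enumerate(idx)]


def graze(foal, stallion, y, rng):
    """Equation (10): move a foal around its stallion."""
    r = rng.uniform(-2, 2)                      # R in [-2, 2], assumed scalar
    return [
        2 * yd * math.cos(2 * math.pi * r * yd) * (s - x) + s
        for x, s, yd in zip(foal, stallion, y)
    ]


rng = random.Random(0)
y = adaptive_y(dim=5, iteration=10, max_iter=100, rng=rng)
new_position = graze([0.2, 0.4, 0.6, 0.8, 1.0], [0.5] * 5, y, rng)
```

Early in the run ADP is close to 1, so most components of Y come from \(\overrightarrow{R_{3}}\); as ADP approaches 0, more components share the single value \(R_{2}\).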

The mating behavior of foals is represented by equation (15):

$$\begin{aligned} \begin{aligned}{}&X_{G,k}^{c} = Crossover(X_{G,i}^{a},X_{G,j}^{b}) i\ne j \ne k, a=c=end \\&\quad Crossover = Mean \end{aligned} \end{aligned}$$
(15)

\(X_{G,i}^{a}\) represents the position of individual a upon re-entering group i after leaving, while \(X_{G,j}^{b}\) represents the position of individual b upon re-entering group j after leaving. \(X_{G,k}^{c}\) represents the position of individual c in group k, generated through the mating of individual a from group i and individual b from group j. The positions within the parentheses in equation (15) denote the positions of their respective parents.

The leader directs the group members towards more favorable habitats. If the present group has dominance over an area, they occupy that region. Conversely, if a different group holds dominance, the former group is obliged to vacate the vicinity. This process can be depicted by equation (16).

$$\begin{aligned} \begin{aligned} Sta_{Gi+1} = {\left\{ \begin{array}{ll} 2Ycos(2\pi RY )*(WH_{best} -Sta_{Gi} ) +WH_{best}\hspace{5.0pt},if \hspace{5.0pt}R_{3} >0.5 \\ 2Ycos(2\pi RY )*(WH_{best} -Sta_{Gi} ) -WH_{best}\hspace{5.0pt},if \hspace{5.0pt}R_{3} \le 0.5\end{array}\right. } \end{aligned} \end{aligned}$$
(16)
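Equations (15) and (16) can be sketched together in Python. The snippet is an illustration under assumptions: `mate` and `move_stallion` are hypothetical names, the Mean crossover is taken component-wise, and the random number compared with 0.5 in equation (16) is drawn as a fresh scalar.

```python
import math
import random


def mate(parent_a, parent_b):
    """Equation (15): offspring position as the mean of the two parents."""
    return [(a + b) / 2 for a, b in zip(parent_a, parent_b)]


def move_stallion(stallion, wh_best, y, rng):
    """Equation (16): move a group leader relative to the best habitat."""
    r = rng.uniform(-2, 2)                 # R in [-2, 2], assumed scalar
    add_best = rng.random() > 0.5          # R3 > 0.5 adds WH_best, else subtracts
    new_position = []
    for s, w, yd in zip(stallion, wh_best, y):
        step = 2 * yd * math.cos(2 * math.pi * r * yd) * (w - s)
        new_position.append(step + w if add_best else step - w)
    return new_position


rng = random.Random(1)
child = mate([1.0, 3.0], [3.0, 5.0])
leader = move_stallion([0.2, 0.9], [0.5, 0.5], [0.3, 0.7], rng)
```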

Leader exchange and selection: Initially, a leader is randomly chosen to uphold the algorithm’s inherent randomness. As the algorithm progresses, leaders are selected based on their fitness values. If a member within a group exhibits a fitness value surpassing that of the current leader, the positions of both the leader and the respective member undergo updating following equation (17).

$$\begin{aligned} \begin{aligned} Sta_{Gi+1} = {\left\{ \begin{array}{ll} X_{Gi}, \hspace{5.0pt}if \hspace{5.0pt}cost(X_{Gi})<cost(Sta_{Gi}) \\ Sta_{Gi},\hspace{5.0pt}if \hspace{5.0pt}cost(X_{Gi})>cost(Sta_{Gi}) \end{array}\right. } \end{aligned} \end{aligned}$$
(17)

Finally, the fitness values of all group leaders are compared, and the one with the best fitness value is deemed the optimal solution for that iteration. The fitness function is defined by equation (18), in which fitness signifies the individual’s fitness value, and L(c) denotes the travel distance.

$$\begin{aligned} fitness=1/L(c) \end{aligned}$$
(18)
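A minimal Python sketch of the leader exchange in equation (17) and the fitness in equation (18) follows. The helper names are hypothetical, and `cost` stands for any function returning the travel distance L(c) of a position, which is assumed to be positive.

```python
def fitness(route_length):
    """Equation (18): a shorter travel distance gives a higher fitness."""
    return 1.0 / route_length


def exchange_leader(stallion, members, cost):
    """Equation (17): swap the leader with the best member if that member is cheaper."""
    if not members:
        return stallion, list(members)
    members = list(members)
    best = min(range(len(members)), key=lambda i: cost(members[i]))
    if cost(members[best]) < cost(stallion):
        stallion, members[best] = members[best], stallion
    return stallion, members


# Toy example: the cost of a "position" is simply the sum of its entries.
leader, group = exchange_leader([5, 5], [[1, 1], [9, 9]], cost=sum)
```

In the toy call the member `[1, 1]` is cheaper than the leader `[5, 5]`, so the two exchange roles and the former leader stays in the group.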

The pseudocode of WHO is presented as algorithm 1.

Algorithm 1. Pseudo-code of WHO.

The proposed discrete wild horse optimizer(DWHO)

Discrete wild horse optimizer (DWHO)

To adapt the wild horse optimizer to CVRP, discretization techniques are integrated into WHO to represent solutions, enabling the optimizer to tackle discrete problems. In addition, three local search strategies (swap operation, reverse operation, and insertion operation) are introduced. We name the resulting algorithm the discrete wild horse optimizer (DWHO). Li and Yin37 introduced the largest ranked value (LRV) method, primarily utilized in job sequencing for workshop scheduling. This method decodes random keys into a discrete sequence according to their largest ranked values.

Figure 1. A 12-dimensional solution route.

Figure 2. Delivery scheme.

Algorithm 2. Pseudo-code of the proposed algorithm DWHO.

Ai and Kachitvichyanukul30 introduced two decoding methods, SR-1 and SR-2, both originally designed for vehicle routing problems. SR-1 uses a 2m-dimensional encoding, while SR-2 adopts a 3m-dimensional encoding that divides the paths. Although they can transform real-number encodings into integer decodings, the path division increases the computational complexity and decoding difficulty. Prins38 introduced the PRINS decoding method, which also targets vehicle routing problems but likewise suffers from high computational complexity. Qian et al.39 proposed the largest-order-value (LOV) decoding technique. In this approach, the continuous values produced by differential evolution (DE) are arranged in descending order, and integers are assigned based on this order: the highest real value receives the first integer position, whereas the lowest value receives the highest integer rank. In this study, to represent the wild horse optimizer more faithfully, we use real-number encoding for the positions of the wild horses and the LOV decoding technique to decode our solutions.
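The ranking step of LOV decoding described above can be sketched as follows. This is an illustrative Python snippet, not the authors' code: the name `lov_decode` is hypothetical, ties are broken by original index (an assumption), and the rank vector itself is taken as the customer sequence, which is one possible reading of the convention.

```python
def lov_decode(position):
    """Rank continuous values in descending order (largest value gets rank 1)."""
    # Indices sorted by decreasing value; Python's sort is stable, so ties
    # keep their original left-to-right order.
    order = sorted(range(len(position)), key=lambda d: -position[d])
    ranks = [0] * len(position)
    for rank, d in enumerate(order, start=1):
        ranks[d] = rank
    return ranks


# A hypothetical 4-dimensional horse position.
print(lov_decode([0.3, 1.7, -0.5, 0.9]))  # → [3, 1, 4, 2]
```

Because every rank from 1 to n appears exactly once, any real-valued position decodes to a valid permutation of the customers, so the continuous update equations never produce an infeasible ordering.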

Figure 3. Swap operation.

Figure 4. Reverse operation.

Figure 5. Insertion operation.

Figure 6. A-set benchmark instances results obtained by DWHO.

Figure 7. P-set benchmark instance results obtained by DWHO.

Solution representations

To illustrate how a CVRP solution is represented, we use a 12-dimensional solution vector, denoted as \(1 \rightarrow 2\rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 6 \rightarrow 7 \rightarrow 8 \rightarrow 9 \rightarrow 10 \rightarrow 11 \rightarrow 12\), as depicted in Fig. 1. Assume the solution describes a transportation plan carried out by three vehicles, with each vehicle serving one sub-path. The start and end points of each sub-path are denoted by “0”, while the other numbers indicate the customers to be served by that vehicle. The three sub-paths are illustrated in Fig. 2a. The overall delivery scheme is shown in Fig. 2b.
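One way to obtain such sub-paths from a customer sequence is a greedy split that opens a new vehicle whenever the next customer would exceed the capacity. The text does not state the splitting rule used, so the Python sketch below is an assumption for illustration only; the function names, the unit demands, and the capacity of 4 are hypothetical, and every single demand is assumed not to exceed the capacity.

```python
def split_routes(sequence, demand, capacity):
    """Greedily cut a customer sequence into capacity-feasible sub-paths."""
    routes, current, load = [], [], 0
    for customer in sequence:
        if current and load + demand[customer] > capacity:
            routes.append(current)          # close the current vehicle's route
            current, load = [], 0
        current.append(customer)
        load += demand[customer]
    if current:
        routes.append(current)
    return routes


def with_depot(routes):
    """Mark the start and end of every sub-path with the depot "0"."""
    return [[0] + route + [0] for route in routes]


sequence = list(range(1, 13))               # the 12-dimensional example vector
demand = {c: 1 for c in sequence}           # hypothetical unit demands
sub_paths = with_depot(split_routes(sequence, demand, capacity=4))
```

With these assumed demands the sequence is cut into three sub-paths of four customers each, mirroring the three-vehicle plan described above.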

Initial population

In the initialization stage, the key parameters for the algorithm are defined, including the population size (popN), stallion ratio (PS), mating probability (PC), maximum number of iterations (MI), and distance matrix. These parameters play a critical role in shaping the behavior and performance of the algorithm throughout its execution. The population size, popN, dictates the number of potential solutions under consideration, and a thoughtful selection of this value can influence the diversity and convergence speed of the optimization process. The stallion ratio, PS, determines the proportion of leaders guiding the groups within the population, influencing the distribution of search efforts among the potential solutions. The mating probability, PC, governs the likelihood of two individuals mating to produce offspring, impacting the exploration and exploitation balance of the algorithm. The maximum number of iterations, MI, serves as a termination criterion, setting the limit for how many times the optimization process will iteratively refine the solutions. In combination, these parameters define the algorithm’s behavior, influence its performance, and shape its ability to find best known solutions or near best known solutions for the given problem.
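A minimal Python sketch of the initialization stage is given below. It assumes random real-valued positions in [0, 1) and a round-robin assignment of foals to stallions; neither detail is specified in the text, and the function name is hypothetical.

```python
import random


def init_population(pop_n, ps, dim, rng):
    """Create pop_n random positions and split them into stallions and groups."""
    positions = [[rng.random() for _ in range(dim)] for _ in range(pop_n)]
    n_stallions = round(pop_n * ps)         # NS = popN * PS
    stallions = positions[:n_stallions]
    foals = positions[n_stallions:]         # Nf = popN * (1 - PS)
    groups = [[] for _ in range(n_stallions)]
    for i, foal in enumerate(foals):
        groups[i % n_stallions].append(foal)  # round-robin (assumption)
    return stallions, groups


stallions, groups = init_population(pop_n=50, ps=0.2, dim=12, rng=random.Random(1))
```

With popN = 50 and PS = 0.2 this yields 10 stallions, each leading a group of 4 foals.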

Local search strategy

To enhance the algorithm’s local search capability, a variable neighborhood search strategy is introduced. This strategy incorporates three distinct types of neighborhood search operations: swap operation, reverse operation, and insertion operation. By employing these operations, the algorithm is able to effectively explore and exploit the solution space, thereby improving its ability to find best known solutions or near best known solutions. The pseudocode of DWHO is presented as algorithm 2.

Swap operation

For a path involving 7 customers, as demonstrated by the sequence \(1 \rightarrow 2\rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 6 \rightarrow 7\), two positions are randomly selected for a swap operation, for example the 2nd and 6th customers. This yields the modified path \(1 \rightarrow 6\rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 7\). A schematic representation of this operation is provided in Fig. 3.

Reverse operation

Using the same 7-customer path, \(1 \rightarrow 2\rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 6 \rightarrow 7\), two positions are randomly selected to initiate the reverse operation. Choosing the segment between the 2nd and 6th customers, the modified path becomes \(1 \rightarrow 6\rightarrow 5\rightarrow 4 \rightarrow 3 \rightarrow 2 \rightarrow 7\). Figure 4 illustrates the reverse operation.

Insertion operation

Given the path \(1 \rightarrow 2\rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 6 \rightarrow 7\), two customers are randomly selected for an insertion operation. If the 2nd customer is chosen to be inserted after the 6th, the resultant path is \(1 \rightarrow 3\rightarrow 4 \rightarrow 5 \rightarrow 6 \rightarrow 2\rightarrow 7\). The insertion operation’s schematic is depicted in Fig. 5.
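The three neighborhood operations can be sketched in Python and checked against the 7-customer examples above. The function names are hypothetical, indices are 0-based (so the 2nd and 6th customers are indices 1 and 5), and the two indices are assumed to differ.

```python
def swap(path, i, j):
    """Exchange the customers at positions i and j."""
    new = list(path)
    new[i], new[j] = new[j], new[i]
    return new


def reverse(path, i, j):
    """Reverse the segment between positions i and j (inclusive)."""
    new = list(path)
    i, j = min(i, j), max(i, j)
    return new[:i] + new[i:j + 1][::-1] + new[j + 1:]


def insert_after(path, i, j):
    """Remove the customer at position i and reinsert it after the one at position j."""
    new = list(path)
    customer = new.pop(i)
    anchor = j - 1 if i < j else j          # position of the anchor after the pop
    new.insert(anchor + 1, customer)
    return new


path = [1, 2, 3, 4, 5, 6, 7]
print(swap(path, 1, 5))          # → [1, 6, 3, 4, 5, 2, 7]
print(reverse(path, 1, 5))       # → [1, 6, 5, 4, 3, 2, 7]
print(insert_after(path, 1, 5))  # → [1, 3, 4, 5, 6, 2, 7]
```

Each function returns a new list and leaves its input untouched, so a candidate move can be evaluated and discarded without copying the current solution first.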

Table 1 Comparison results of A-set benchmark instances.
Table 2 Comparison of CPU computation times on A-set benchmark instances.
Table 3 Comparison results of P-set benchmark instances.
Table 4 Comparison of CPU computation times on P-set benchmark instances.
Table 5 Wilcoxon test comparison results of A-set benchmark instances.
Table 6 Wilcoxon test comparison results of P-set benchmark instances.

Results and discussion

Parameter setting

In this section, we carry out extensive experiments to verify the performance of DWHO. The experiments cover the A-set and P-set benchmark instances40. To validate the effectiveness of DWHO, we compare its results with four algorithms: the basic wild horse optimizer (BWHO), the hybrid firefly algorithm27, dynamic space reduction ant colony optimization (DSRACO)5, and discrete artificial ecosystem-based optimization (DAEO)31. BWHO is a basic wild horse optimizer that only discretizes the obtained results without incorporating optimization and local search strategies; its solving effectiveness is moderate. The hybrid firefly algorithm integrates 2-opt, accelerating the solution speed and enhancing the optimization capability of the firefly algorithm. In DSRACO, ACO is integrated with a dynamic space reduction method, an elite enhanced mechanism, and large-scale neighborhood search methods to improve the solution. DAEO uses five local search operators to discretize the position update formula of the original AEO algorithm and introduces the 2-opt algorithm to simulate the mutation mechanism of organisms in the ecosystem.

The computational configurations are listed as follows.

  • OS: Windows 10 (x64)

  • CPU: Intel Core i5-11400 (2.60 GHz)

  • RAM: 16GB

  • Language: MATLAB R2016b

The following are the relevant parameter settings for DWHO.

  • Population size: popN=50

  • Stallion ratio: PS=0.2

  • Mating probability: PC=0.13

  • Maximum number of iterations: MI=1500

  • Number of Stallions: NS=popN*PS=10

  • Number of foals: Nf=popN*(1-PS)=40

For the sake of fairness in comparison, we set the number of iterations to 1500. Each experiment is conducted 20 times. An experiment terminates upon either reaching the maximum number of iterations or obtaining an optimal solution early. The Gap is defined by equation (19). All algorithms were run on the same machine.

$$\begin{aligned} Gap= (BS - BKS) / BKS * 100\% \end{aligned}$$
(19)

Here, BS denotes the best solution achieved by DWHO, BKS represents the best known solution for the benchmark instance, and Gap signifies the relative error of BS with respect to BKS.
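As a brief worked example of equation (19), the Python snippet below computes the Gap for made-up values; the numbers are purely illustrative and are not taken from the result tables.

```python
def gap_percent(bs, bks):
    """Equation (19): relative deviation of BS from BKS, in percent."""
    return (bs - bks) / bks * 100.0


# Hypothetical values: a best solution of 1010 against a best known 1000.
print(round(gap_percent(1010, 1000), 2))  # → 1.0
```

A Gap of 0 means the best known solution was matched, and a negative Gap means the obtained solution is shorter than the best known one.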

To enhance the diversity of the algorithm’s search, we introduced swap, reverse, and insertion operations in our experiments. The inclusion of these local search strategies also improves the algorithm’s solving capability. During each swap operation, two nodes are randomly selected and swapped. For the reverse operation, two nodes are randomly selected, and the nodes between them are reversed. For the insertion operation, two nodes are randomly selected, and one is inserted after the other. Each type of operation is performed 50 times.

Comparison results of A-set benchmark instances

The experimental results of DWHO, BWHO, hybrid firefly algorithm, DSRACO, and DAEO are compared using the A-set benchmark instances as the experimental dataset. The comparison results are shown in Table 1. As Table 1 shows, DWHO matches or closely approximates BKS in numerous instances. For small scale instances, the solutions of DWHO are equivalent to the best known solutions. For medium scale and large scale instances, the solution Gaps of DWHO are relatively small, mostly around 1%. Compared with BWHO, DWHO exhibits significant advantages across all instances, greatly enhancing solution accuracy. In comparison with hybrid firefly algorithm, DWHO also holds a notable advantage, as its solutions outperform those of hybrid firefly algorithm on most instances. When compared to DSRACO, DWHO similarly demonstrates strong performance, with only a slightly weaker solving capability for large scale instances. The solving capability of DWHO is also strong compared to DAEO: on small-scale instances both algorithms perform comparably, while on medium and large-scale instances DWHO significantly outperforms DAEO. In Table 1, data marked with * and highlighted in bold signify solutions obtained by DWHO that are either equivalent to the best known solutions or the best among the five algorithms. We also compared the CPU runtime of each algorithm on the A-set benchmark instances, measured in seconds and summarized in Table 2. As shown in Table 2, BWHO has the shortest runtime due to the lack of local search strategies. DWHO demonstrates a clear advantage over the other three algorithms on small and medium-scale instances. Although it is not as fast as DSRACO on large-scale instances, it shows a noticeable advantage over hybrid firefly algorithm and DAEO.

As depicted in Fig. 6, the route diagrams for A-n32-k5, A-n33-k6, A-n55-k9, and A-n64-k9 indicate that the paths derived using DWHO for these instances are rational and clear.

Comparison results of P-set benchmark instances

Using the P-set benchmark instances as the experimental dataset, the experimental outcomes of DWHO are compared with those of BWHO, hybrid firefly algorithm, DSRACO, and DAEO. The outcomes of this comparative analysis are shown in Table 3. As Table 3 shows, for small scale instances DWHO consistently obtains solutions that match the best known solutions. For medium scale and large scale instances, DWHO keeps the solution Gaps within a narrow margin.

In P-set benchmark instances, DWHO consistently demonstrates a pronounced advantage, substantially enhancing solution accuracy. The solution precision of DWHO distinctly surpasses that of BWHO, with superior results across all P-set benchmark instances. Relative to hybrid firefly algorithm, DWHO retains a significant edge; in most instances, its solutions outperform those of hybrid firefly algorithm. When compared with DSRACO, DWHO maintains commendable performance, particularly in larger scale instances. For example, in large-scale problems such as P-n76-k4, P-n76-k5, and P-n101-k4, DWHO evidently outperforms DSRACO. Similarly, DWHO exhibits superior solving capability over DAEO on most medium- and large-scale instances. Additionally, in instances such as P-n22-k8, P-n55-k8, and P-n55-k15, the results achieved surpass the best known solutions. It is particularly noteworthy that for the large scale instance P-n101-k4, while the other algorithms under comparison fail to reach the best known solution, DWHO succeeds, attesting to the efficacy of the proposed DWHO algorithm. In Table 3, data marked with * and highlighted in bold signify solutions obtained by DWHO that are either equivalent to the best known solutions or the best solutions among the five algorithms, while data marked with ** and highlighted in bold indicate solutions that are superior to the best known solutions. On the P-set benchmark instances, DWHO has a significantly shorter CPU runtime than hybrid firefly algorithm, DSRACO, and DAEO; only BWHO is faster. This also demonstrates the effectiveness of our proposed algorithm. The runtime results are shown in Table 4.

As shown in Fig. 7, from the route diagrams of P-n22-k2, P-n45-k5, P-n55-k7, and P-n101-k4, it can be inferred that DWHO effectively retrieves their optimal solutions.

Effectiveness verification

Five algorithms are involved in this experiment. The experiment is conducted on the A-set and P-set benchmark instances; each instance is run 20 independent times, and the mean value and standard deviation are recorded as the final results. To make the comparison statistically sound, we use Wilcoxon’s rank sum test, a nonparametric statistical test, for a single-instance analysis, with the significance level \(\alpha\) set to 0.05. The symbols “+”, “=”, and “-” denote that DWHO is significantly better than, similar to, and significantly worse than its competitor on an instance, respectively, according to the p value. The final results for the five algorithms are shown in Tables 5 and 6.
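The per-instance comparison can be sketched in Python as follows. This is a hedged illustration, not the procedure used to produce Tables 5 and 6: it implements the two-sided rank sum test with the normal approximation and no tie correction of the variance, the function names are hypothetical, and lower route cost is assumed to be better.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1      # average rank for tied values
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of a
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided p value
    return z, p


def compare(dwho_costs, other_costs, alpha=0.05):
    """Return "+", "=", or "-" for DWHO against a competitor on one instance."""
    z, p = rank_sum_test(dwho_costs, other_costs)
    if p >= alpha:
        return "="
    return "+" if z < 0 else "-"            # lower cost ranks first


# Hypothetical costs from 20 runs of each algorithm on one instance.
dwho = [800 + i for i in range(20)]
other = [830 + i for i in range(20)]
print(compare(dwho, other))                 # → +
```

In practice a statistics library routine would normally replace the hand-written test; the pure-Python version is shown only to keep the example self-contained.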

As shown in Table 5, DWHO significantly outperforms BWHO, hybrid firefly algorithm, and DAEO on the A-set benchmark instances, and it is comparable to DSRACO, being only slightly weaker on a few instances. Table 5 demonstrates that DWHO outperforms BWHO on every instance. In comparison with hybrid firefly algorithm, DWHO shows superiority in 12 instances, equivalence in 8 instances, and no instances of inferiority. When compared to DSRACO, DWHO is superior in 1 instance, equivalent in 13 instances, and slightly inferior in 6 instances. Against DAEO, DWHO excels in 14 instances, is equivalent in 4 instances, and is inferior in 2 instances. Similarly, Table 6 demonstrates that DWHO consistently outperforms BWHO in all evaluated instances of the P-set. When compared to hybrid firefly algorithm, DWHO exhibits superior performance in 17 instances, is equivalent in 7, and is not inferior in any instance. Against DSRACO, DWHO shows superiority in 11 instances, similarity in 7, and inferiority in 6 instances. Compared to DAEO, DWHO is superior in 13 instances, equivalent in 7, and inferior in 4 instances. Overall, the data indicate that DWHO performs better than the other four algorithms on both the A-set and P-set benchmark instances, with DSRACO on the A-set being the closest competitor. This assessment is grounded in Wilcoxon’s rank sum test.

Conclusion

In this paper, we adapt the wild horse optimizer from the engineering domain to propose a discrete wild horse optimizer for solving CVRP. Building on WHO, three neighborhood local search strategies (swap operation, reverse operation, and insertion operation) are applied to enhance both the search capability and the solving capability of DWHO for CVRP. Moreover, a decoding technique is introduced to allow DWHO solutions to represent vehicle routing schemes. Experiments on 44 benchmark instances were conducted, and the results indicate that DWHO with the decoding technique can effectively address discrete combinatorial optimization problems, especially CVRP, with satisfactory performance. DWHO was compared with BWHO, hybrid firefly algorithm, DSRACO, and DAEO. The comparative results revealed that DWHO possesses robust problem-solving abilities. It significantly outperforms BWHO, and generally, the solutions from DWHO are superior to those obtained from hybrid firefly algorithm, DSRACO, and DAEO. In A-set benchmark instances, DWHO achieves the best known solutions for most medium scale and small scale instances. For larger scale instances, the solution deviations are relatively minor. As for P-set benchmark instances, not only does DWHO attain the best known solutions for medium scale and small scale instances, but it also accomplishes this for the large scale instance P-n101-k4, a feat the other four compared algorithms could not achieve. This highlights the efficacy and superiority of our proposed algorithm. In future work, we will explore the potential of DWHO in solving other discrete combinatorial optimization problems, aiming to further broaden the applicability of the algorithm.