Introduction

The behavioral and signaling patterns of social animal groups1,2 have sparked extensive research into collective behavior and decision-making, primarily to understand the underlying mechanisms that drive these emergent properties3. Inhibitory signals, in particular, play an essential role in social insects, fine-tuning collective decision-making and coordinating critical tasks such as house-hunting and foraging4,5,6,7,8. These inhibitory signals, often communicated through vibrations or tactile interactions, allow colonies to efficiently allocate resources and labor. For instance, in honeybees, stop signals can prevent the recruitment of additional foragers to poor or perilous food sources, thereby optimizing foraging efforts4,5,6,7. Similarly, during nest site selection, bees use stop signals to halt the promotion of less suitable sites, ensuring that the colony converges on the best available option8. By integrating these stop signals, social insects enhance their ability to make adaptive and robust decisions, ultimately supporting the survival and success of the colony. The fascinating social behavior of honeybees, including their intricate recruiting signaling patterns such as the waggle dance9, has inspired the design of decentralized decision-making algorithms10,11,12,13,14, and their application to robotic systems15.

According to Nieh4,7 and Pastor et al.5, during foraging tasks, honeybees’ stop signals can alter the probability of waggle dancers ceasing their dance and leaving the nest, thereby reducing recruitment. However, dancers do not exhibit an immediate response to these signals. This feature is characteristic of modulatory signals, which are produced in various contexts and are known for subtly shifting the probability of receiver behaviors based on their response thresholds. Lau et al.6 further suggested that, depending on receiver response thresholds, stop signals do not exert a strong colony-wide effect until signaling levels are sufficiently elevated. A similar mechanism has long been observed in brain neuronal activity16, where the balance between excitation and inhibition is critical for processes such as sensory perception, motor control and cognitive functions. Recent efforts have sought to establish the similarities between individual decision-making in primate brains and collective decision-making in social insect colonies17,18,19.

Field experiments on honeybee house-hunting8 showed that stop signals are also pervasive in these scenarios. This study introduced the term cross-inhibition, as it demonstrated that stop signals were predominantly exchanged between agents promoting competing options. Cross-inhibition has proven essential for resolving deadlocks in decisions between very similar alternatives8,10,20,21,22. However, as argued in ref. 22, cross-inhibition trades accuracy for stability. This means the system can confidently make a decision for any option, regardless of whether it is the highest quality one or not. In value-based tasks, this trade-off may not necessarily be detrimental, as the system prioritizes making a choice that yields a sufficiently high reward within a limited time, thus balancing the speed-value trade-off22,23,24,25. Furthermore, depending on the intensity of cross-inhibition, this mechanism may pause the decision-making process if the qualities of the available options are not deemed high enough, allowing the system to wait for a potentially better option to appear10,12. Such a system transitions from indecision to decision through pitchfork or saddle-node bifurcations14,26, controlled by the model parameters.

In honeybee-inspired collective decision-making models, the cross-inhibition rate has usually been considered a linear function of the population sending the stop signals. This choice represents the simplest modeling assumption, where the abandonment of one’s opinion is linearly proportional to the accumulation of stop signals received from peers with opposing options. However, similar to the foraging behavior of bees discussed earlier, Seeley et al. also suggested that nest-site scout waggle dances are likely terminated when stop-signal inhibition surpasses a certain threshold8. Motivated by this experimental evidence, here we investigate the impact of non-linear inhibitory responsiveness14,26,27 within honeybee-inspired decision-making models. The response depends on the amount of stop signals received and diminishes or becomes negligible when stop signals are sparse, see Fig. 1a. This approach also aligns with the concept of complex social contagion models28,29, which posits that multiple exposures to a given opinion are required to trigger a shift in belief. Similarly, our model assumes that a minimum threshold of stop signals must be reached before initiating the cross-inhibition response.

Fig. 1: Non-linear inhibitory responses and their effects on collective decision-making.

a Strength of the non-linear cross-inhibition responses as a function of the population fraction that is sending the inhibitory signals, fβ. Different lines represent responses that will be studied throughout the text, separated into two panels for clarity. On the left, a smooth sigmoid, σ1(fβ; x0 = 0.333, a = 20) (light blue), and a sharp sigmoid, σ1(fβ; x0 = 0.3, a = 500) (dark blue), are depicted. On the right, a smooth linearly bounded sigmoid, σ2(fβ; x0 = 0.3, a = 10) (light red), and a sharp linearly bounded sigmoid, σ2(fβ; x0 = 0.3, a = 500) (dark red), are depicted. The parameter x0 controls the position of the sigmoid’s ascent and the parameter a controls the smoothness of the ascent (see Eq. (2) for more details). The black dotted line indicates a linear cross-inhibition response. b Bifurcation diagrams on increasing interdependence λ for linear cross-inhibition (black circles), a sharp sigmoid cross-inhibition function σ1(fβ; x0 = 0.3, a = 500) (blue squares), and a smooth bounded sigmoid function σ2(fβ; x0 = 0.3, a = 10) (red triangles). The case \({\lambda }^{{\prime} }=0\) is included as a continuous line for comparison. Other model parameters used are π1 = π2 = 0.1, q1 = 9, q2 = 10 and \({\lambda }^{{\prime} }=1\).

Focusing on binary decision tasks, we demonstrate that our approach enhances the consensus formation capabilities of decentralized systems compared to linear cross-inhibition models, particularly when dealing with options of similar qualities. The benefits are twofold: first, the final decision is achieved with virtually no bees committed to the less favored option; second, the time to reach a stationary state is significantly reduced.

Results and discussion

Model

We use a simplified version of the model proposed by List et al.30, referred to as the LES model, which is inspired by the house-hunting behavior of honeybees and serves as the basis of our decision-making framework. This agent-based model incorporates the typical discovery, abandonment, and recruitment transitions found in other collective decision-making models10,11,12,13. Although the original model did not include cross-inhibition interactions, extending it to incorporate this mechanism is straightforward.

In the LES model, a swarm of N scout bees, indexed by i = 1, …, N, evaluates k potential nest sites, indexed by α = 1, …, k. Each site α is characterized by an intrinsic quality qα ≥ 0 and a spontaneous discovery probability πα ≥ 0. Bees can be in any of k + 1 states: uncommitted or committed to one of the k available sites. The transitions from uncommitted to committed state are governed by discovery and recruitment rates, which represent individual and social behavior, respectively, and are balanced by an interdependence parameter λ. Likewise, the transitions from committed to uncommitted states are governed by abandonment and cross-inhibition rates, which reflect individual and socially motivated behaviors, respectively.

The model’s mean-field rate equations for the fractions of agents committed to each site, fα(t), can be derived using the master equation formalism31. Including cross-inhibition, these equations are:

$${\dot{f}}_{\alpha }(t)= \,{f}_{0}(t)\left[(1-\lambda ){\pi }_{\alpha }+\lambda {f}_{\alpha }(t)\right]\\ -{r}_{\alpha }{f}_{\alpha }(t)-{\lambda }^{{\prime} }{f}_{\alpha }(t)\sum\limits_{\beta \ne \alpha }\sigma ({f}_{\beta }),\,\,\alpha =1,\ldots ,k$$
(1)

where \({f}_{0}(t)=1-{\sum }_{\alpha = 1}^{k}{f}_{\alpha }(t)\) is the fraction of uncommitted bees. The discovery rate, (1 − λ)πα, refers to the rate at which uncommitted bees discover and commit to site α, and the recruitment rate λfα represents the rate at which uncommitted bees are recruited by peers already committed to option α. The rate rα at which bees stop advertising a site is inversely proportional to its quality, rα = 1/qα. Finally, the cross-inhibition rate, \({\lambda }^{{\prime} }{f}_{\alpha }\sigma ({f}_{\beta })\) (β ≠ α), is the rate at which bees abandon their options after receiving stop signals from those advocating for competing options. Here, \({\lambda }^{{\prime} }\) regulates the intensity of cross-inhibition interactions. For the purposes of this study, we will set \({\lambda }^{{\prime} }=1\).

The stationary points of the system can be determined by numerically solving the equations obtained by setting \({\dot{f}}_{\alpha }(t)=0\)32. Without cross-inhibition (\({\lambda }^{{\prime} }=0\)), the system simplifies to the expressions derived in ref. 31. This particular system has been thoroughly analyzed in refs. 32,33.
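As an illustration of Eq. (1), the following minimal sketch evaluates the right-hand side of the rate equations and relaxes them with a fixed-step fourth-order Runge-Kutta scheme. The function names (`rhs`, `integrate`), the integrator, the step size and the value λ = 0.6 in the demo are assumptions made for this example only; they are not taken from the original study, which does not specify an integration scheme here.

```python
def rhs(f, q, pi, lam, lam_p=1.0, sigma=lambda x: x):
    """Right-hand side of Eq. (1) for committed fractions f = [f_1, ..., f_k]."""
    f0 = 1.0 - sum(f)                      # uncommitted fraction
    s = [sigma(fb) for fb in f]            # inhibition strength sent by each option
    s_total = sum(s)
    out = []
    for a in range(len(f)):
        # discovery + recruitment
        gain = f0 * ((1.0 - lam) * pi[a] + lam * f[a])
        # abandonment (r = 1/q) + cross-inhibition from all competing options
        loss = f[a] / q[a] + lam_p * f[a] * (s_total - s[a])
        out.append(gain - loss)
    return out


def integrate(f_init, q, pi, lam, lam_p=1.0, sigma=lambda x: x,
              dt=0.01, t_max=200.0):
    """Fixed-step RK4 integration of Eq. (1) from f_init up to time t_max."""
    f = list(f_init)
    args = (q, pi, lam, lam_p, sigma)
    for _ in range(int(round(t_max / dt))):
        k1 = rhs(f, *args)
        k2 = rhs([x + 0.5 * dt * d for x, d in zip(f, k1)], *args)
        k3 = rhs([x + 0.5 * dt * d for x, d in zip(f, k2)], *args)
        k4 = rhs([x + dt * d for x, d in zip(f, k3)], *args)
        f = [x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(f, k1, k2, k3, k4)]
    return f


if __name__ == "__main__":
    # Qualities and discovery probabilities as in Fig. 1b; lam = 0.6 is illustrative.
    f_end = integrate([0.0, 0.0], q=[9.0, 10.0], pi=[0.1, 0.1], lam=0.6)
    print("f1, f2 at t = 200:", f_end)
```

Passing a different `sigma` callable switches between the linear response and any non-linear response, without changing the rest of the code.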

Non-linear cross-inhibition response

The function σ(fβ) in Eq. (1) determines the actual strength of the cross-inhibition based on the fraction of bees fβ sending stop signals. Traditionally, cross-inhibition, similarly to recruitment interactions, has been modeled as a proportional response to the fraction of adversary population, i.e., σ(fβ) = fβ. Here we consider a non-linear cross-inhibition response. Specifically, we propose two sigmoid-like test functions, where the cross-inhibition strength remains weak for small values of the inhibiting population:

$${\sigma }_{1}({f}_{\beta };{x}_{0},a) = \frac{1}{1+{e}^{-a({f}_{\beta }-{x}_{0})}}\,\Theta ({f}_{\beta }),\\ {\sigma }_{2}({f}_{\beta };{x}_{0},a) = \frac{{f}_{\beta }}{1+{e}^{-a({f}_{\beta }-{x}_{0})}}.$$
(2)

The parameter a controls the steepness of these functions, and x0 is a threshold controlling the sigmoid’s ascent position. In σ1, the Heaviside step function (Θ(x) = 1 if x > 0, else Θ(x) = 0) ensures that the cross-inhibition response is turned off when there is no inhibiting population (fβ = 0). Some instances of these functions, tested later on the nest-site selection dynamics, are depicted in Fig. 1a. The function σ1 captures the scenario where the cross-inhibition strength increases sharply, similar to a step function, once the threshold population x0 is approached. On the other hand, σ2 assumes that the cross-inhibition strength grows sub-linearly below this threshold and transitions to a limiting linear regime above it. The threshold x0 represents the population fraction at which cross-inhibition begins to have a significant effect. In this study, we have fixed x0 to be approximately one third of the total population. This choice ensures that a sufficiently large committed population, formed through quality-sensitive communication, is established before cross-inhibition takes effect. A lower x0 would favor the option that gains an early advantage due to random fluctuations, while a higher x0 would delay cross-inhibition until interdependence is strong and a leading option has already emerged. The sigmoidal functions shown in Fig. 1a have slightly different x0 values to ensure that both non-linear functions surpass the strength of the linear model at the same population fraction. This adjustment allows for a fair comparison of their impact on the decision-making process.
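The two response functions of Eq. (2) can be written down directly. The sketch below is one possible implementation; the helper `_logistic` and the default parameter values (taken from the curves quoted in the caption of Fig. 1) are choices made for this example, and the numerically stable form of the logistic factor is an implementation detail not discussed in the text.

```python
import math


def sigma_lin(f):
    """Linear response, sigma(f) = f."""
    return f


def _logistic(z):
    """1 / (1 + exp(-z)), written so that exp never overflows for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sigma1(f, x0=0.3, a=500.0):
    """Sigmoid of Eq. (2); the Heaviside factor switches it off at f = 0."""
    if f <= 0.0:
        return 0.0
    return _logistic(a * (f - x0))


def sigma2(f, x0=0.3, a=10.0):
    """Linearly bounded sigmoid of Eq. (2): f times the logistic factor."""
    return f * _logistic(a * (f - x0))


if __name__ == "__main__":
    for f in (0.0, 0.1, 0.3, 0.5, 1.0):
        print(f, sigma_lin(f), sigma1(f), sigma2(f))
```

At the threshold f = x0 the logistic factor equals 1/2, so σ1 = 0.5 and σ2 = x0/2 there; well below the threshold both responses are much weaker than the linear one.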

Fixed point analysis

In the following, we will focus on the simplest case of a binary decision between two sites that differ in quality (q1 < q2). The number of stable fixed points of the system’s dynamics depends on the values of the model’s parameters. Due to the analytical complexity of the model’s equations, we resort to numerical methods to obtain the different fixed points (see “Methods”).

Increasing the strength of the social interactions leads to an (unfolded) pitchfork bifurcation34 between one stable fixed point and two (asymmetric) stable fixed points separated by a saddle point. This behavior is shown in Fig. 1b for linear cross-inhibition (black-circle curve), and has been previously observed in similar models10,11,12. When switching to non-linear cross-inhibition, a bifurcation still occurs, but its position depends on the specific non-linear cross-inhibition function chosen. Two examples of this are also shown in Fig. 1b. The curves with blue squares and red triangles represent results for a sharp sigmoid function σ1(fβ; 0.3, 500) and a smooth linearly bounded sigmoid function σ2(fβ; 0.3, 10), respectively. Results obtained for other non-linear functions displayed in Fig. 1a are shown in Supplementary Fig. SF1. Reducing the strength of cross-inhibition, or slightly varying the threshold parameter x0, produces qualitatively similar results, though the positions of the bifurcations are shifted. For bifurcation plots at \({\lambda }^{{\prime} }=0.5\), see Supplementary Fig. SF2.

Performance measure

To assess the model’s performance with non-linear cross-inhibition interactions, we numerically evaluate the stationary fixed point values, focusing on the occupation fraction for the best-quality site, \({f}_{2}^{* }\). This quantity represents the decision accuracy of the system. However, as previously discussed, decision accuracy alone is not the only relevant variable a system seeks to maximize, especially in value-based decisions22,24,35.

In scenarios where the available sites are similar in quality, it may be preferable to make a quick decision rather than spending a large amount of time to choose a slightly better site. Therefore, in addition to accuracy, we use agent-based stochastic simulations to measure two additional performance metrics: (1) the probability \(P({f}_{2}^{* })\) of reaching the best option; and (2) the time tss required to settle into this stationary state. See “Methods” for simulation details. These complementary quantities provide a comprehensive evaluation of the system’s decision accuracy and speed performance.

Figure 2 shows the behavior of these three quantities as a function of the interdependence λ for close values of the sites’ qualities q1 = 9 and q2 = 10. Non-linear cross-inhibition results (see Eq. (2)) are shown together with the results for linear cross-inhibition. In the non-linear case, we use the same parameters as in Fig. 1. We can observe that all non-linear cross-inhibition functions tested outperform the linear cross-inhibition in terms of pure consensus accuracy. However, the linear approach provides a higher probability of selecting the better option. These differences are particularly relevant for small to moderate values of the interdependence parameter, especially when the linear model has only one stable fixed point. In this regime, the system must balance independent discoveries and recruitment to build consensus for either option. Not triggering cross-inhibition unless an option has gained some representation allows the system to build a stronger consensus, albeit at the risk of less reliably choosing the better option. Nonetheless, this comes with the benefit of making a decision in a much shorter time, as shown in Fig. 2c. This can be a significant advantage when choosing between similarly valued options. As reported in refs. 27,36, quicker consensus can be achieved by allowing the system to first build sub-populations of comparable sizes before triggering competition between them. In those works, this is achieved by time-varying social interaction rates, including recruitment and cross-inhibition. In contrast, we propose a time-independent mechanism that weakens the perception of cross-inhibition signals unless they are received from a significant portion of the population. This approach allows both populations to grow without interference from stop signals, either by pooling environmental cues or peer opinions. Once the populations reach substantial sizes, cross-inhibition is triggered, and a faster decision is made.

Fig. 2: Comparison between linear and non-linear cross-inhibition responses in a binary-decision problem.

a Occupation fraction for the best-quality site, \({f}_{2}^{* }\). b Probability of reaching the best option, \(P({f}_{2}^{* })\). c Time to settle into the stationary state, tss. Other model parameters are π1 = π2 = 0.1, q1 = 9, q2 = 10, and \({\lambda }^{{\prime} }=1\), for a system of size N = 1000. Error bars indicate the standard error of the mean.

Each type of sigmoid function is tested with both a sharp response (high value, a = 500), where the cross-inhibition rapidly shifts from no effect to its maximum or linear bound, and a smooth response (low value, a = 10 or 20), where the transition to the final bound is more gradual. Interestingly, our results for the best-quality site occupation fraction show a remarkable insensitivity to the specific details of the sigmoid functions (see Fig. 2a). Moreover, these results are significantly higher for low and moderate interdependence compared to those of the linear cross-inhibition model. On the other hand, the probability of retrieving the best option is considerably reduced for sharper cross-inhibitory responsiveness, independently of the function selected (Fig. 2b). This is due to the indiscriminate action of inhibition on the option that first reaches the activation threshold x0, irrespective of its quality. The smooth sigmoid σ1 also yields probabilities similar to those of the sharp functions, because the inhibiting population is over-represented once the threshold is crossed, whereas the smooth linearly bounded sigmoid σ2, which approaches the linear bound gradually, gives an intermediate result. It is also worth mentioning the significant reduction in deliberation time achieved with a non-linear cross-inhibition response, which occurs almost independently of the specific choice of non-linear function.

In order to encapsulate the effect of these three measures in a single quantity that summarizes the performance of non-linear cross-inhibition, we define the objective performance:

$${\psi }_{\sigma }=\frac{{f}_{2}^{* }\,P({f}_{2}^{* })}{{t}_{ss}},$$
(3)

weighting the three quantities at stake. To assess how non-linear cross-inhibition compares to linear cross-inhibition, we also introduce the performance ratio χ = ψσ/ψlin. Figure 3 depicts this performance ratio for three pairs of site qualities (q1 = 8, 9 and 9.5, with q2 = 10 in all cases). Supplementary Fig. SF3 shows results for smaller values of q1. In all cases, we observe a performance ratio χ > 1 for nearly all values of λ. Moreover, this ratio increases as the site qualities become closer, indicating a more significant performance improvement when using non-linear cross-inhibition.
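For concreteness, Eq. (3) and the ratio χ amount to the following small computation. The function names are hypothetical, and the numbers in the demo are made up purely to show the arithmetic; they are not results from this study.

```python
def objective_performance(f2_star, p_best, t_ss):
    """psi of Eq. (3): consensus on the best site times success probability,
    divided by the time needed to reach the stationary state."""
    return f2_star * p_best / t_ss


def performance_ratio(nonlinear, linear):
    """chi = psi_sigma / psi_lin; each argument is a tuple (f2*, P(f2*), t_ss)."""
    return objective_performance(*nonlinear) / objective_performance(*linear)


if __name__ == "__main__":
    # Illustrative values only (not data from the paper).
    chi = performance_ratio(nonlinear=(0.9, 0.6, 20.0), linear=(0.6, 0.8, 60.0))
    print("chi =", chi)
```

A value χ > 1 means that the gain in consensus and speed outweighs any loss in the probability of picking the best option, since the three quantities enter ψ with equal weight.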

Fig. 3: Performance of non-linear cross-inhibition responses under varying interdependence.

Performance ratio χ of non-linear cross-inhibitory responses on increasing interdependence λ, in a binary-choice scenario. Three quality pairs are represented in a: (q1 = 8, q2 = 10), b (q1 = 9, q2 = 10) and c (q1 = 9.5, q2 = 10). Other model parameters are \({\pi }_{1}={\pi }_{2}=0.1,{\lambda }^{{\prime} }=1\) and system size N = 1000. Error bars indicate the propagated standard error of the mean.

The performance improvement peaks around λ ≈ 0.2, corresponding to the point where the difference in decision times between the linear and non-linear models is the greatest. As interdependence increases, the performance improvement diminishes because the three quantities become more similar across models. Nevertheless, non-linear responses still yield better overall performance.

The decrease in performance improvement with higher λ is due to the combined effect of interdependence and cross-inhibition driving the losing population to very low fractions, while the winning population dominates (apart from a small uncommitted fraction). In this scenario, the cross-inhibition strength exerted by the winning population on its adversary becomes similar to that in the linear model, regardless of the specific non-linear response chosen. The advantage of the non-linear response is mainly due to the weaker effect of the losing population’s cross-inhibition. Furthermore, as noted in ref. 32, when λ → 1, the system can make a strong decision without cross-inhibition, although incorporating cross-inhibition significantly reduces decision time.

Comparing different non-linear cross-inhibition functions, we find that their performances are relatively close, with the smooth, linearly bounded sigmoid being the only one that underperforms. The effectiveness of a strong, sudden activation of cross-inhibition was previously reported by Talamali et al.27. However, while their study primarily noted an improvement in choosing accuracy without significantly affecting decision time, our approach demonstrates a comprehensive enhancement in both accuracy and decision time.

One important question is the effect of system size on the relative performance of linear and non-linear cross-inhibition, as empirical studies suggest that colony size influences the effectiveness of cross-inhibition in real swarms (see, e.g., ref. 37). Smaller system sizes exhibit larger fluctuations, which can impact the ability of the swarm to consistently select the best option. To investigate this issue, we conducted numerical simulations for system sizes ranging from N = 50 to N = 1000. These simulations allow us to assess how fluctuations influence the probability of selecting the best option \(P({f}_{2}^{* })\) and the time required to reach a stable consensus tss. In Supplementary Fig. SF4, we show \(P({f}_{2}^{* })\) and the time tss for different system sizes. Finite-size fluctuations, which become more pronounced as the system size decreases, directly impact the probability of selecting the best option, as smaller systems are more susceptible to random fluctuations that can drive them toward an incorrect consensus state. In particular, these effects are more pronounced in the linear model. On the other hand, the effect on tss is more nuanced. For small system sizes, the linear cross-inhibition model struggles to reach a true stationary state for all values of λ, particularly before or near the bifurcation point. Instead of stabilizing, fluctuations continuously drive the system between different consensus states indefinitely (see Supplementary Fig. SF5). This explains why tss cannot be properly identified for small λ and N = 50, 100 in Supplementary Fig. SF4 (right). As a consequence, for these values, \(P({f}_{2}^{* })\) does not strictly measure the fraction of realizations in which the highest-quality option is chosen, and we discard them. In Fig. 4, we plot the performance ratio as a function of λ for different values of N. Other conditions are reported in Supplementary Fig. SF6. Our results suggest that while finite-size effects play a role, particularly in smaller systems, non-linear cross-inhibition maintains its advantage by mitigating unwanted fluctuations more effectively than linear cross-inhibition.

Fig. 4: Performance of a non-linear cross-inhibition response with varying system size.

Performance ratio χ of the sharp sigmoid non-linear cross-inhibition response σ1(x0 = 0.3, a = 500) on increasing interdependence λ for different system sizes N. Other model parameters are \({\pi }_{1}={\pi }_{2}=0.1,{\lambda }^{{\prime} }=1,{q}_{2}=10\), and q1 = 8 (a) and q1 = 9.5 (b), as indicated in each plot. Error bars indicate the propagated standard error of the mean. The inset in a shows the non-linear cross-inhibition function used, indicating the cross-inhibition strength σ as a function of the inhibiting population fβ.

The performance ratio allows us to assess the effect of different interaction patterns in various scenarios. So far, we have tested consensus dynamics by fixing the discovery probabilities and varying the swarm’s interdependence. Increasing interdependence reduces the amount of individual exploration by prioritizing peers’ options. This strategy has been shown to optimize consensus accuracy, even in the absence of cross-inhibition30,32, although it may extend decision time32. When the discovery probabilities increase (with fixed λ), the system more readily incorporates environmental information. This reduces decision time but leads to poorer final consensus, especially when options are of similar quality32. The consensus decrease in this situation is caused by the fact that environmental information is socially unfiltered, i.e., it is incorporated at a rate that is independent of how much of the population is already advertising a given choice. Thus, πα can also be viewed as a noise parameter hampering overall accuracy. In such scenarios, cross-inhibition is crucial both to maintain high group cohesion (most of the population committed to the same option) and to avoid deadlocks, as reported in various case studies10,11,20,22.

Figure 5 shows the performance ratio of non-linear cross-inhibition responses as discovery probabilities increase, for fixed values of quality and interdependence. The corresponding performance variables are plotted in Supplementary Fig. SF7. As the noise in the system increases, non-linear cross-inhibition yields better performance. Weakening the stop signals from the losing population, whatever its quality, becomes essential in this context: although uncommitted agents continuously introduce “incorrect” information at a steady rate, the stop-signaling mechanism ensures that a dominant option can suppress this noise, thereby maintaining a high level of consensus. Interestingly, examining the individual quantities \({f}_{2}^{* }\) and tss on increasing π1 = π2 ≡ π1,2, we observe opposing trends for the linear and non-linear models. Under the linear response, \({f}_{2}^{* }\) decreases while tss increases; in contrast, non-linear responses reverse this trend. Consequently, the model performance improves as π1,2 increases. Comparing the different non-linear responses, we observe that the performance is consistently higher for the standard sigmoid functions than for the linearly bounded sigmoids. In each case, the sharp response also grants better performance.

Fig. 5: Performance of non-linear cross-inhibition responses under varying options’ discovery probabilities.

Performance ratio χ of non-linear cross-inhibition responses as a function of spontaneous discovery probabilities π1 = π2 ≡ π1,2, in a binary-decision choice. Other parameters are \({q}_{1}=9,{q}_{2}=10,\lambda =0.6,{\lambda }^{{\prime} }=1\) and N = 1000. Error bars indicate the propagated standard error of the mean.

Although this study focuses on a binary decision process, the model can be extended to scenarios involving more than two alternatives12. In such cases, the advantages of non-linear models become even more pronounced, as demonstrated in Supplementary Fig. SF8 for a five-option scenario. When cross-inhibition is linear, the system settles into a stationary deadlock, with no single option dominating. In contrast, with non-linear cross-inhibition, the system initially appears stuck in indecision. However, due to stochastic fluctuations, one option eventually gains an advantage, leading to an asymmetric distribution of commitment. This outcome stems from the reduced inhibitory effect of smaller populations: in the linear model, all options exert mutual inhibition, preventing a clear decision. By contrast, non-linear cross-inhibition allows one option to exert a disproportionately strong inhibitory signal once it gains enough quorum, ultimately breaking the deadlock and establishing consensus.

Finally, it is worth noting that non-linear inhibitory responses can be broadly applied to various modeling scenarios. For instance, Reina et al.22 propose a model where quality explicitly modulates the strength of recruitment and cross-inhibition interactions, while options’ discovery and abandonment are considered noise, a quality-independent parameter. Modulating cross-inhibition interactions while varying this noise parameter produces results very similar to Fig. 5.

Conclusions

Here we investigate non-linear cross-inhibition interactions in decentralized decision-making models inspired by house-hunting honeybees. The primary design goal is to weaken an individual’s response to stop signals when they are received from a small fraction of the population. We model this behavior using two non-linear functions, tested with different parameters (Fig. 1). Focusing on binary decision tasks, we demonstrate that non-linear cross-inhibition results in higher consensus (the fraction of the population committed to the chosen option) and quicker decisions. These two benefits come at the cost of reducing accuracy in reliably choosing the best quality option. Nonetheless, in decisions made among options with close qualities, a stronger and quicker decision for a “good enough” option may be more beneficial than a weaker consensus or a slower decision process that yields the absolute best option22,24,35. Moreover, while we have focused on scenarios where a single best option exists, both linear and non-linear models can also break symmetry between equally valued alternatives. In such cases, our key findings, namely the superior performance of non-linear models in terms of consensus formation and decision time, remain valid. Our results thereby open promising avenues for future research in decentralized collective decision-making and practical applications in swarm robotics. In this context, the predictions of our mean-field analysis could be tested through experiments with robot swarms, which can be easily programmed to respond non-linearly to stop signals from their nearest neighbors.

Methods

Numerical analysis of the LES model with cross-inhibition

Linear cross-inhibition

When the cross-inhibition response is linear, the set of equations Eq. (1) can be combined to derive an equation for the fixed points of the uncommitted population. In each equation, the cross-inhibition term can be reformulated as \(-{\lambda }^{{\prime} }{f}_{\alpha }{\sum }_{\beta \ne \alpha }{\,f}_{\beta }=-{\lambda }^{{\prime} }{f}_{\alpha }(1-{f}_{0}-{f}_{\alpha })\). Setting \({\dot{f}}_{\alpha }=0\) then gives a quadratic equation in \({f}_{\alpha }^{* }\), whose solution provides an expression for \({f}_{\alpha }^{* }({f}_{0}^{* })\) (Eq. (5)). Summing this expression over all α and using \({\sum }_{\alpha }{f}_{\alpha }^{* }=1-{f}_{0}^{* }\) yields a closed equation for the uncommitted population’s fixed point, \({f}_{0}^{* }\) (Eq. (4)). This leads to the following set of equations:

$$1-{f}_{0}^{* }= -\frac{k(\lambda +{\lambda }^{{\prime} }){f}_{0}^{* }}{2{\lambda }^{{\prime} }}+\frac{k}{2}\\ +\frac{1}{2{\lambda }^{{\prime} }}\mathop{\sum }_{\alpha =1}^{k}\left[{r}_{\alpha }\pm \sqrt{{((\lambda +{\lambda }^{{\prime} }){f}_{0}^{* }-{r}_{\alpha }-{\lambda }^{{\prime} })}^{2}-4{\lambda }^{{\prime} }(1-\lambda ){f}_{0}^{* }{\pi }_{\alpha }}\right]$$
(4)
$${f}_{\alpha }^{* }=\frac{-((\lambda +{\lambda }^{{\prime} }){f}_{0}^{* }-{r}_{\alpha }-{\lambda }^{{\prime} })\pm \sqrt{{((\lambda +{\lambda }^{{\prime} }){f}_{0}^{* }-{r}_{\alpha }-{\lambda }^{{\prime} })}^{2}-4{\lambda }^{{\prime} }{f}_{0}^{* }(1-\lambda ){\pi }_{\alpha }}}{2{\lambda }^{{\prime} }}\,\,\alpha =1,\ldots ,k$$
(5)

where k is the number of sites. Equation (4) provides one equation for each possible combination of ± signs in Eq. (5), and each of these must be solved numerically. Not all sign combinations will yield a solution, but for those that do, we can determine the fixed points \({f}_{0}^{* }\) and subsequently compute the values of \({f}_{\alpha }^{* }\). The stability of these fixed points can be analyzed through a linear stability analysis of Eq. (1), leading to the following expression for the elements of the Jacobian matrix:

$${J}_{\alpha \alpha }= \, \lambda ({f}_{0}^{* }-{f}_{\alpha }^{* })-{\lambda }^{{\prime} }(1-{f}_{0}^{* }-{f}_{\alpha }^{* })-(1-\lambda ){\pi }_{\alpha }-{r}_{\alpha }\\ {J}_{\alpha \beta }= -(1-\lambda ){\pi }_{\alpha }-{f}_{\alpha }^{* }({\lambda }^{{\prime} }+\lambda ).$$
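The Jacobian elements above translate directly into a stability test. The sketch below builds the matrix for the linear cross-inhibition model and, for the binary case, obtains its two eigenvalues from the trace and determinant; a fixed point is stable when both real parts are negative. The function names are hypothetical, and restricting the eigenvalue routine to a 2 × 2 matrix is a simplification chosen for this example.

```python
import cmath


def jacobian_linear(f_star, q, pi, lam, lam_p=1.0):
    """Jacobian of Eq. (1) with sigma(f) = f, evaluated at f_star = [f_1*, ..., f_k*]."""
    k = len(f_star)
    f0 = 1.0 - sum(f_star)
    J = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            if a == b:
                J[a][b] = (lam * (f0 - f_star[a])
                           - lam_p * (1.0 - f0 - f_star[a])
                           - (1.0 - lam) * pi[a]
                           - 1.0 / q[a])            # r_alpha = 1 / q_alpha
            else:
                J[a][b] = -(1.0 - lam) * pi[a] - f_star[a] * (lam_p + lam)
    return J


def eigenvalues_2x2(J):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (0.5 * (tr + root), 0.5 * (tr - root))


def is_stable(f_star, q, pi, lam, lam_p=1.0):
    """True when all eigenvalues of the binary-case Jacobian have negative real part."""
    J = jacobian_linear(f_star, q, pi, lam, lam_p)
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(J))


if __name__ == "__main__":
    # With lam = lam_p = 0 the fixed point is f_a* = pi_a q_a / (1 + sum_b pi_b q_b).
    print(is_stable([0.9 / 2.9, 1.0 / 2.9], [9.0, 10.0], [0.1, 0.1], 0.0, 0.0))
```

For more than two options the same Jacobian applies, but the eigenvalues would have to be computed with a general-purpose linear algebra routine.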

Non-linear cross-inhibition

When incorporating the non-linear cross-inhibition responses (Eq. (2)), it is not feasible to derive a closed-form equation for \({f}_{0}^{* }\), as is possible in the linear case. Instead, one must numerically solve the system of equations where \({\dot{f}}_{\alpha }=0\). To find all fixed points, it is necessary to explore a sufficient number of points within the simplex (fα;  α = 1, …, k) as initial guesses for the numerical solver. The stability of these fixed points can then be confirmed by numerically integrating the system’s equations in the vicinity of the obtained solutions.
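A multi-start search of this kind could look as follows for the binary case. This is only a sketch under stated assumptions: the solver is a plain Newton iteration with a central-difference Jacobian (the text does not say which numerical solver was used), the starting points form a regular grid on the simplex, and the names `newton` and `fixed_points`, the grid size and the tolerances are all choices made for this example. The routine returns stable and unstable fixed points alike, so stability still has to be checked separately, for instance by integrating Eq. (1) from a slightly perturbed state.

```python
import math


def sigma1(f, x0=0.3, a=500.0):
    """Sharp sigmoid sigma_1 of Eq. (2), with the Heaviside factor at f = 0."""
    if f <= 0.0:
        return 0.0
    z = a * (f - x0)
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def rhs(f, q, pi, lam, lam_p, sigma):
    """Eq. (1) for a binary decision, f = (f1, f2)."""
    f0 = 1.0 - f[0] - f[1]
    s = (sigma(f[0]), sigma(f[1]))
    return tuple(
        f0 * ((1.0 - lam) * pi[a] + lam * f[a]) - f[a] / q[a] - lam_p * f[a] * s[1 - a]
        for a in range(2)
    )


def newton(start, q, pi, lam, lam_p, sigma, tol=1e-10, max_iter=100, h=1e-7):
    """Newton iteration from one initial guess; returns None if it fails."""
    x, y = start
    args = (q, pi, lam, lam_p, sigma)
    for _ in range(max_iter):
        g = rhs((x, y), *args)
        if max(abs(g[0]), abs(g[1])) < tol:
            return (x, y)
        gxp, gxm = rhs((x + h, y), *args), rhs((x - h, y), *args)
        gyp, gym = rhs((x, y + h), *args), rhs((x, y - h), *args)
        j11, j21 = (gxp[0] - gxm[0]) / (2 * h), (gxp[1] - gxm[1]) / (2 * h)
        j12, j22 = (gyp[0] - gym[0]) / (2 * h), (gyp[1] - gym[1]) / (2 * h)
        det = j11 * j22 - j12 * j21
        if not math.isfinite(det) or abs(det) < 1e-14:
            return None
        x -= (j22 * g[0] - j12 * g[1]) / det      # solve J d = g by Cramer's rule
        y -= (j11 * g[1] - j21 * g[0]) / det
    return None


def fixed_points(q, pi, lam, lam_p=1.0, sigma=sigma1, n_grid=15, merge_tol=1e-6):
    """Distinct roots of Eq. (1) inside the simplex, from a grid of initial guesses."""
    roots = []
    for i in range(n_grid + 1):
        for j in range(n_grid + 1 - i):           # keeps f1 + f2 <= 1
            root = newton((i / n_grid, j / n_grid), q, pi, lam, lam_p, sigma)
            if root is None:
                continue
            x, y = root
            if x < -1e-9 or y < -1e-9 or x + y > 1.0 + 1e-9:
                continue                          # discard roots outside the simplex
            if all(math.hypot(x - u, y - v) > merge_tol for u, v in roots):
                roots.append(root)
    return sorted(roots)


if __name__ == "__main__":
    for fp in fixed_points([9.0, 10.0], [0.1, 0.1], lam=0.5):
        print(fp)
```

A denser grid lowers the chance of missing a fixed point with a small basin of convergence, at the cost of more solver calls.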

Master equation and stochastic simulation algorithm

We employed Gillespie’s algorithm38 to estimate the stationary probability distributions, enabling us to analyze the likelihood of selecting the best site. The transition rates that define the master equation are inferred from the system’s ODEs (Eq. (1)). The transition rates used in the Gillespie algorithm are:

$${T}_{\alpha }^{disc} = \, {n}_{0}(1-\lambda ){\pi }_{\alpha },\,{\mbox{discovery of option}}\,-\alpha ,\\ {T}_{\alpha }^{aban} = \, {n}_{\alpha }{r}_{\alpha },\,{\mbox{abandonment of option}}\,-\alpha ,\\ {T}_{\alpha }^{rec} = \, \frac{{n}_{0}\lambda {n}_{\alpha }}{N},\,{\mbox{recruitment of option}}\,-\alpha ,\\ {T}_{\alpha }^{c.i.} = \, {n}_{\alpha }{\lambda }^{{\prime} }\sum\limits_{\beta \ne \alpha }\sigma \left(\frac{{n}_{\beta }}{N}\right),\,{\mbox{cross-inhibition of option}}\,-\alpha .$$

Here, \({r}_{\alpha }={q}_{\alpha }^{-1}\) represents a simplified rate compared to the initial formulations of this model30,31. The mathematical derivation that leads from the master equation to the ODE system (for the version of the model without cross-inhibition) can be found in detail in ref. 31.
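The transition rates listed above can be plugged into a direct-method Gillespie simulation, sketched below. Several details are assumptions made for this example and are not specified in the text: all bees start uncommitted, discovery and recruitment of the same option are merged into a single commitment event (they produce the same state change), abandonment and cross-inhibition are merged likewise, and the function name, seed handling and default `t_max` are illustrative.

```python
import random


def gillespie(N, q, pi, lam, lam_p=1.0, sigma=lambda x: x, t_max=100.0, seed=0):
    """One stochastic realization; returns event times and states (n_0, n_1, ..., n_k)."""
    rng = random.Random(seed)
    k = len(q)
    n0, n = N, [0] * k                 # assumed initial condition: everyone uncommitted
    t = 0.0
    times, states = [0.0], [(n0, *n)]
    while True:
        s = [sigma(nb / N) for nb in n]
        # Commitment to option a: discovery + recruitment.
        gain = [n0 * (1.0 - lam) * pi[a] + n0 * lam * n[a] / N for a in range(k)]
        # Decommitment from option a: abandonment + cross-inhibition.
        loss = [n[a] / q[a]
                + n[a] * lam_p * sum(s[b] for b in range(k) if b != a)
                for a in range(k)]
        rates = gain + loss
        total = sum(rates)
        if total <= 0.0:
            break
        t += rng.expovariate(total)    # exponentially distributed waiting time
        if t > t_max:
            break
        # Pick an event with probability proportional to its rate.
        u = rng.random() * total
        acc, event = 0.0, None
        for idx, r in enumerate(rates):
            if r <= 0.0:
                continue               # impossible events are never selected
            event = idx
            acc += r
            if u < acc:
                break
        if event < k:                  # an uncommitted bee commits to option `event`
            n0 -= 1
            n[event] += 1
        else:                          # a bee committed to option `event - k` resets
            n[event - k] -= 1
            n0 += 1
        times.append(t)
        states.append((n0, *n))
    return times, states


if __name__ == "__main__":
    times, states = gillespie(200, [9.0, 10.0], [0.1, 0.1], lam=0.6, t_max=50.0)
    print("final state (n0, n1, n2):", states[-1])
```

Repeating the call with different seeds and counting which option dominates at the end gives a Monte Carlo estimate of the probability of selecting each site.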

To estimate the stationary probability distributions, we run \(10^{4}\) simulations of the stochastic simulation algorithm, setting a maximum time t = 500 to ensure a stationary state is reached. We then count how many realizations settle on each of the possible stable fixed points. However, when there is only one fixed point and the stationary values of the population fractions \({f}_{1}^{* },{f}_{2}^{* }\) are very similar (as is the case for the linear cross-inhibition model for λ < 0.2), we use a different approach to obtain a more reliable estimate. In this case, we run \(10^{3}\) longer simulations (\(t=10^{4}\)) and collect \(10^{3}\) evenly spaced data points from the stationary state. Using these values \(({f}_{1}^{* },{f}_{2}^{* })\), we compute the probability of each option winning. For the stationary times, tss, we analyze \(10^{3}\) of these trajectories to determine when the system reaches the stationary plateau. We bin the temporal evolution into intervals of approximately 1 unit of the dimensionless time variable and compare the absolute difference between the population values at each time and the stationary average (computed using the system state at sufficiently long times) for each population α = 0, …, k. When the difference satisfies \(| {f}_{\alpha }(t)-\langle {f}_{\alpha }\rangle | < 0.1\), we consider the population to have entered the stationary state at time t. An estimate of tss is then obtained by averaging the times at which this condition is met for each population (see Supplementary Fig. SF9).
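The estimate of tss described above can be sketched as follows for a single trajectory. Some choices here are assumptions, since the text leaves them open: the stationary average is taken over the last 20% of the time bins, each population is considered to have entered the stationary state at the first bin that satisfies the 0.1 criterion, the entry time is reported as the left edge of that bin, and empty bins reuse the previous bin's value. The function name and the synthetic trajectory in the demo are illustrative only.

```python
import math


def settling_time(times, fractions, bin_width=1.0, tail=0.2, tol=0.1):
    """Estimate t_ss from one trajectory.

    times:     increasing sample times.
    fractions: one tuple (f_0, f_1, ..., f_k) per sample time.
    Returns the mean entry time over populations, or None if some
    population never comes within `tol` of its stationary average.
    """
    n_pop = len(fractions[0])
    n_bins = int(times[-1] // bin_width) + 1
    sums = [[0.0] * n_pop for _ in range(n_bins)]
    counts = [0] * n_bins
    for t, f in zip(times, fractions):
        b = min(int(t // bin_width), n_bins - 1)
        counts[b] += 1
        for p in range(n_pop):
            sums[b][p] += f[p]
    # Bin averages; an empty bin carries the previous value forward.
    binned = []
    for b in range(n_bins):
        if counts[b]:
            binned.append([v / counts[b] for v in sums[b]])
        else:
            binned.append(list(binned[-1]) if binned else list(fractions[0]))
    # Stationary average from the tail of the trajectory (assumed: last 20% of bins).
    n_tail = max(1, int(tail * n_bins))
    stationary = [sum(row[p] for row in binned[-n_tail:]) / n_tail
                  for p in range(n_pop)]
    entry_times = []
    for p in range(n_pop):
        for b, row in enumerate(binned):
            if abs(row[p] - stationary[p]) < tol:
                entry_times.append(b * bin_width)
                break
    if len(entry_times) < n_pop:
        return None
    return sum(entry_times) / n_pop


if __name__ == "__main__":
    # Synthetic relaxation toward f_1 = 0.8 with time constant 5 (illustration only).
    ts = [0.5 * i for i in range(201)]
    fr = [(1.0 - 0.8 * (1.0 - math.exp(-t / 5.0)),
           0.8 * (1.0 - math.exp(-t / 5.0))) for t in ts]
    print("t_ss estimate:", settling_time(ts, fr))
```

Averaging this estimate over many stochastic realizations then gives tss together with its standard error.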