Mapping the adaptive landscape of Batesian mimicry using 3D-printed stimuli

Taylor, Christopher H.; Watson, David James George; Skelhorn, John; Bell, Danny; Burdett, Simon; Codyre, Aoife; Cooley, Kathryn; Davies, James R.; Dawson, Joshua Joseph; D’Cruz, Tahiré; Gandhi, Samir Raj; Jackson, Hannah J.; Lowe, Rebecca; Ogilvie, Elizabeth; Pond, Alexandra Lei; Rees, Hallie; Richardson, Joseph; Sains, Joshua; Short, Francis; Brignell, Christopher; Davidson, Gabrielle L.; Rowland, Hannah M.; East, Mark; Goodridge, Ruth; Gilbert, Francis; Reader, Tom

doi:10.1038/s41586-025-09216-3

Article
Open access
Published: 02 July 2025

Nature volume 644, pages 706–713 (2025)

Abstract

In a classic example of adaptation, harmless Batesian mimics gain protection from predators through resemblance to one or more unpalatable models^1,2. Mimics vary greatly in accuracy, and explaining the persistence of inaccurate mimics is an ongoing challenge for evolutionary biologists^3,4. Empirical testing of existing hypotheses is constrained by the difficulty of assessing the fitness of phenotypes absent among extant species, leaving large parts of the adaptive landscape unexplored⁵—a problem affecting the study of the evolution of most complex traits. Here, to address this, we created mimetic phenotypes that occupy hypothetical areas of trait space by morphing between 3D images of real insects (flies and wasps), and tested the responses of real predators to high-resolution, full-colour 3D-printed reproductions of these phenotypes. We found that birds have an excellent ability to learn to discriminate among insects on the basis of subtle differences in appearance, but this ability is weaker for pattern and shape than for colour and size traits. We found that mimics gained no special protection from intermediate resemblance to multiple model phenotypes. However, discrimination ability was lower in some invertebrate predators (especially crab spiders and mantises), highlighting that the predator community is key to explaining the apparent inaccuracy of many mimics.

Main

Batesian mimics gain protection when predators treat them as defended ‘models’ despite being palatable prey^1,2. As this deception of predators relies on a degree of perceived similarity, increasing resemblance should give a higher probability of misidentification. Yet mimics vary greatly in accuracy^3,6, raising the question of what stops ever-greater mimetic accuracy from evolving⁷. Numerous theoretical explanations⁸ have proposed functional trade-offs affecting mimetic appearance^9,10 and factors that might cause relaxed selection for accuracy⁴ such as predators’ inability to detect differences between mimics and models¹¹ or reduced motivation to discriminate¹². Many of these hypotheses are untested experimentally in realistic systems, and there is no consensus about the causes of variation in mimetic accuracy.

The expected outcomes of selection on visual adaptations depend on the specific characteristics of signallers and receivers in each study system¹³. Different visual receivers interpret the same colour patterns in different ways¹⁴; some features are more easily associated with a reward than others by a given receiver¹⁵; and systems with more stimulus types elicit more generalized responses^16,17. Thus, while experiments using abstract stimuli can illuminate general principles, we must test these principles in realistic systems. However, when we observe extant species, we see only small sections of the adaptive landscape, and miss an opportunity to examine the fitness of phenotypes that do not currently exist. One successful solution to this issue is to manipulate existing phenotypes, for example, by painting or covering real organisms to change their appearance¹⁸, or creating artificial replicas¹⁹, but the manipulations involved tend to be limited in range and realism^19,20,21.

To overcome these limitations, we generated stimuli combining the relevance and realism of working with real insects, full three-dimensional (3D) representation and the power to manipulate fine details of visual phenotypes. Hoverflies (Syrphidae) are a classic study system to provide reference points for our stimuli, with Batesian mimicry of wasps (Vespidae) varying across species from near-perfect, through approximate, to non-existent mimicry⁷ (Extended Data Table 1). We used 3D scans of real model (wasp) and mimic (hoverfly) species as starting points to define axes of variation within multivariate phenotypic space, and generated gradients of mimetic similarity by smoothly manipulating visual traits (shape, colour, pattern and size) along those axes. We then used additive manufacturing (3D printing) to turn these images into physical stimuli, enabling us to explore the fitness landscape of mimetic accuracy beyond the examples seen in nature.

To test key hypotheses about the existence of inaccurate Batesian mimics, we first tested the extent to which wild predators discriminate highly accurate, yet imperfect, mimics from their models. We then tested whether the presence of more than one model species affords increased protection to intermediate mimics (the multiple-models hypothesis¹²). We tested the relative signal salience²² of shape, colour, pattern and size by varying those traits independently and testing which are under the strongest selection for accuracy. Finally, we tested the eye-of-the-beholder hypothesis¹¹ by comparing the responses of a range of insectivores towards the same set of mimetic stimuli.

Discrimination ability

In the few tests of predator discrimination between real models and mimics, birds consistently distinguish images or specimens of wasps from any tested hoverfly, including some seemingly accurate Batesian mimics^23,24. However, using real specimens or images can never reveal the possible protection of a hypothetical mimic even closer to the model phenotype. The degree to which the fundamental limits of predator perception and cognition constrain decisions to attack mimicry complexes is therefore uncertain.

We generated stimuli (Fig. 1) along three axes, with each axis using the common wasp Vespula vulgaris to represent the aversive model at one end point, denoted V100 (100% based on V. vulgaris). Each axis used a different fly (Diptera) taxon as the other end point: the non-mimic Mesembrina meridiana (M100), intermediate mimic Syrphus ribesii (S100) and accurate mimic Chrysotoxum spp.³ (C100; Extended Data Table 1). We selected three intermediate points on each axis corresponding to equally spaced values of shape, colour, pattern and size. For example, S25/V75 indicates a stimulus of 25% based on S. ribesii and 75% based on V. vulgaris. To the human eye, the most accurate of these mimics appears considerably more like V. vulgaris than any existing hoverfly. The axis M. meridiana to V. vulgaris, viewable digitally in 3D, is provided in Supplementary Data 1.
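The axis notation can be illustrated with a minimal sketch. It assumes, purely for illustration, that each of the four traits is scored on a normalized scale from 0 (fly end point) to 1 (wasp end point) and blended linearly; the function names `morph` and `label` and the trait scores are hypothetical and do not describe the 3D morphing pipeline itself.

```python
TRAITS = ("shape", "colour", "pattern", "size")

def morph(fly, wasp, wasp_pct):
    """Linearly blend two trait dictionaries; wasp_pct is the percentage based on the wasp."""
    w = wasp_pct / 100.0
    return {t: (1.0 - w) * fly[t] + w * wasp[t] for t in TRAITS}

def label(fly_code, wasp_pct, wasp_code="V"):
    """Build labels in the style used in the text, e.g. S25/V75."""
    fly_pct = 100 - wasp_pct
    if wasp_pct == 0:
        return f"{fly_code}100"
    if wasp_pct == 100:
        return f"{wasp_code}100"
    return f"{fly_code}{fly_pct}/{wasp_code}{wasp_pct}"

# Hypothetical normalized trait scores (assumption): fly = 0, wasp = 1 for every trait
fly_end = {t: 0.0 for t in TRAITS}
wasp_end = {t: 1.0 for t in TRAITS}

# Two end points plus three equally spaced intermediates on the S. ribesii axis
steps = (0, 25, 50, 75, 100)
axis = [(label("S", p), morph(fly_end, wasp_end, p)) for p in steps]
labels = [name for name, _ in axis]
```

Under these assumptions, the stimulus labelled S25/V75 sits three-quarters of the way from the fly end point to the wasp end point on every trait.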

**Fig. 1: Overview of the methods used to generate artificial mimetic stimuli.**

The focal predators were wild, free-living great tits (Parus major) in Madingley Wood, Cambridge, UK. Great tits are generalist predators of Hymenoptera and Diptera²⁵. We trained them to forage from feeding stations presenting arrays of small opaque dishes covered by openable lids that concealed a mealworm (Tenebrio molitor). We then fixed 3D stimuli to the lids to signal the reward status: half of the dishes displayed a non-mimetic fly stimulus (M100) and contained a mealworm, and half displayed a model wasp stimulus (V100) with no reward (Fig. 2a). The dishes were reset daily for a new session with a randomized arrangement of stimuli. Birds visited the feeding stations repeatedly during a session and could select among any unopened dishes at any time. We assumed that the birds attempted to maximize their rate of food consumption²⁶ to minimize opportunity costs and predation risk, starting with the dishes that they perceived most likely to be rewarding. We therefore estimated the level of protection of each stimulus according to how early or late in the sequence that dish was opened.

**Fig. 2: Discrimination ability of great tits.**

During the first day of training, the birds showed no bias towards either stimulus (47% for M100 and 53% for V100 among the first 15 dishes per feeder; binomial test, P = 0.38; n = 45), suggesting that previous encounters with real wasps or flies did not influence foraging choices (this was winter, when the birds had probably not encountered wasps for several months). After 3 weeks of training, the birds consistently targeted M100 first and either opened V100 dishes last or not at all (M100 88%, V100 12% among the first 15 dishes per feeder across the final 3 days of training; binomial test, P < 0.0001; n = 135; Extended Data Fig. 1a).
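As a rough check on the arithmetic, an exact binomial test against an even split can be sketched as follows. The counts are an assumption reconstructed from the rounded percentages (about 21 of 45 and about 119 of 135 M100 choices), and this section does not state which software or which sidedness was used, so both one-sided and two-sided values are computed for comparison with the reported figures.

```python
from math import comb

def binom_pmf(i, n, p=0.5):
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1.0 - p) ** (n - i)

def lower_tail(k, n, p=0.5):
    """P(X <= k)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

def upper_tail(k, n, p=0.5):
    """P(X >= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def two_sided(k, n):
    """Two-sided exact test against p = 0.5: double the smaller tail, capped at 1."""
    return min(1.0, 2.0 * min(lower_tail(k, n), upper_tail(k, n)))

# Assumed counts, reconstructed from rounded percentages in the text
first_day_m100, first_day_n = 21, 45   # about 47% of 45
final_m100, final_n = 119, 135         # about 88% of 135

p_first_one_sided = lower_tail(first_day_m100, first_day_n)
p_first_two_sided = two_sided(first_day_m100, first_day_n)
p_final_two_sided = two_sided(final_m100, final_n)
```

Summing each tail directly, rather than subtracting from 1, avoids floating-point cancellation when the tail probability is extremely small, as it is for the post-training counts.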

We next presented the full range of stimuli from all three axes, with approximately one-third being unrewarding V100 dishes, one-fifth being rewarding M100 dishes and the remainder being rewarding mimic dishes sampled from the other axis points (Fig. 2c and Supplementary Video 1). Birds generalized immediately from their learned preference towards the fly dishes, targeting the least-accurate stimuli first and moving on to the more accurate mimics when no other options were available (Extended Data Fig. 1b). With further opportunity for learning, the birds increased their discrimination between mimics and models (Extended Data Fig. 1b,c). After around 10 days, the responses stabilized and the birds discriminated all mimics from models, targeting V100 dishes significantly later than all other stimulus types, and avoiding the 75% wasp-like mimics more than those 50% and below (n = 3,071 presentations; Fig. 2b and Extended Data Table 2).

All mimetic stimuli were prioritized over wasp stimuli, with a trend of increasingly wasp-like stimuli receiving greater protection (Fig. 2b), despite all mimics being associated with the same reward. This aligns with signal detection theory, which predicts that, when multiple prey types are available, predators should respond more cautiously towards perceived signals that are more likely to have originated from a model²⁷. Moreover, the optimal response will be more cautious when models are more abundant, and/or when models carry a more severe cost²⁸. For our birds, the consequence of incorrectly targeting a model is a small opportunity cost (if, for example, another bird gets to a mealworm first): ethical and practical constraints mean that higher costs would be difficult to implement. In nature, wasps potentially impose a more severe cost (a sting), but also bring potential nutritional rewards, so the resulting protection would depend on quantifying these outcomes.

In a separate validation experiment, we investigated the extent to which our printed stimuli were treated like real insects by the birds. We again trained them to discriminate flies from wasps, then substituted half of the printed stimuli with dead specimens of real flies and wasps. Birds targeted the printed flies first (the learned reward in the training phase), followed by the real flies (novel but rewarding), with printed and real wasps (both unrewarding) receiving the same level of protection (Extended Data Fig. 2). Thus, the birds distinguished between at least some of the printed stimuli and their real-life equivalents—which, considering our other results (Fig. 2), is unsurprising—but generalized from printed stimuli to real insects, indicating that they recognize a commonality between the two.

Predictions of optimal discrimination levels are only meaningful if predators possess the sensory and cognitive abilities to achieve those levels. Our birds could detect and remember very subtle differences between mimic and model appearance, and used these differences to select rewarding over unrewarding prey. These results align with theoretical predictions that, given enough experience²⁹, signal receivers should have the ability to discriminate between tiny differences. Our findings provide a crucial baseline for studies of mimicry, rejecting the argument that inaccurate mimics are already sufficiently accurate to be indistinguishable from their models, given favourable conditions for the predator. This implies that, if a bird chooses to avoid inaccurate mimics, it may be driven more by its motivation^28,30 than a lack of ability. Moreover, certain factors might increase the level of sensory or cognitive difficulty of a discrimination task, such as separation of prey in time or space preventing side-by-side comparison³¹, prey movement³² or the existence of multiple model species, as we explore next.

Multiple models

Evidence from studies of Müllerian mimicry suggests that more complex prey communities cause predators to use broader generalization and a more conservative foraging strategy^16,33. The addition of a second model species to a mimicry system may increase the fitness of intermediate Batesian mimics, although these ‘jacks of all trades’ may not necessarily outperform perfect mimics of either model¹². Despite the clear existence of model diversity in nature³⁴, almost all experimental studies of Batesian mimicry use a single model phenotype (but see refs. ^35,36), and a key prediction from the multiple-models hypothesis remains untested regarding whether a mimic intermediate between two model phenotypes gains greater protection than one with an equivalent level of accuracy to one model, but further removed from the second.

To test this, we presented two distinct model (that is, unrewarding) stimuli to the birds: the common wasp V. vulgaris (V100) and a solitary wasp Argogorytes mystaceus (A100; Extended Data Table 1). The latter is also defended by a sting and displays black-and-yellow warning colours, but differs from the common wasp in appearance. We generated an axis including three stimuli intermediate between the two models (A75/V25, A50/V50 and A25/V75) and a further two stimuli at each end extrapolated along the same trajectory (A125/V−25, A150/V−50, A−25/V125 and A−50/V150). From the multiple-models hypothesis, the intermediate stimuli should receive greater protection than the extrapolated stimuli, despite equivalent similarity to a single model, due to their resemblance to the second model species. For example, A50/V50 and A−50/V150 are both 50 units from the model V100, but A50/V50 is much closer to the second model A100 (50 units) than A−50/V150 is (150 units). If there is any additive effect of mimicry to the two models, the intermediate A50/V50 should receive greater protection than the extrapolated A−50/V150.
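The distance argument can be made concrete with a short sketch. It assumes each stimulus is represented only by its position on the A. mystaceus to V. vulgaris axis, expressed as the A percentage (so V100 sits at 0 and A100 at 100); the dictionary and function names are illustrative.

```python
# Position of each stimulus = percentage based on A. mystaceus (assumed 1-D representation)
STIMULI = {
    "A150/V-50": 150, "A125/V-25": 125,                 # extrapolated beyond A100
    "A75/V25": 75, "A50/V50": 50, "A25/V75": 25,        # intermediate
    "A-25/V125": -25, "A-50/V150": -50,                 # extrapolated beyond V100
}
MODELS_1M = {"V100": 0}
MODELS_2M = {"V100": 0, "A100": 100}

def distances(position, models):
    """Distance in axis units from a stimulus to every model present."""
    return {name: abs(position - pos) for name, pos in models.items()}

def nearest_model_distance(position, models):
    """Distance to the closest model, the predictor used in the comparison."""
    return min(distances(position, models).values())

intermediate = distances(STIMULI["A50/V50"], MODELS_2M)
extrapolated = distances(STIMULI["A-50/V150"], MODELS_2M)
```

Both stimuli are 50 units from V100, so any extra protection for A50/V50 in the two-model treatment would have to come from its closer resemblance to A100.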

We trained wild, free-living great tits and blue tits (Cyanistes caeruleus; the latter making a low proportion of foraging visits; Methods) to avoid model stimuli using the same approach as in the discrimination ability experiment. Six feeding stations were divided among two treatments: three with a single unrewarding model stimulus V. vulgaris V100 (1M treatment) and three including a second unrewarding model stimulus A. mystaceus A100 (2M treatment; Fig. 3a). The inclusion of the 1M treatment acted as a control, enabling us to compare directly whether the addition of a second model stimulus in 2M alters the protection received by any of the mimetic stimuli. Both treatments included a rewarding non-mimic fly stimulus M. meridiana (M100). Once birds were consistently targeting the M100 stimuli first (Extended Data Fig. 3a), we introduced the intermediate and extrapolated stimuli as rewarding (Batesian) mimics, alongside existing stimuli.

**Fig. 3: Testing the multiple-models hypothesis.**

At the start of the testing phase, preference for M100 stimuli was strong, but this weakened over approximately 10 days before reaching an asymptote with lower levels (but not an absence) of discrimination among stimuli (Extended Data Fig. 3b,c). Protection received by the various mimetic stimuli declined with increasing distance from the nearest model (n = 5,987 presentations; distance term ΔAICc = −28.9, d.f. = 1; Fig. 3b and Extended Data Table 3). This pattern was similar across both treatments (treatment × distance term ΔAICc = 1.4, d.f. = 1) and there was no increase in protection for intermediate as opposed to extrapolated stimuli (intermediate × distance term ΔAICc = 3.2, d.f. = 2).

The early decline in predator selectivity suggests that the birds were no longer as motivated to discriminate among the numerous stimulus types introduced in the testing phase. This contrasts with the results from our discrimination ability experiment, in which the birds increased selectivity early in the testing phase as they improved their recognition of stimuli. Differences between responses towards M100 and V100 stimuli (common to both experiments) were stronger in the discrimination ability experiment (Fig. 2) than in either treatment of the multiple-models experiment (Fig. 3c), despite the latter offering no other fly-like stimuli. Assuming comparable predator populations, this may indicate that the birds found the mimics in the multiple-models experiment sufficiently challenging to discriminate that they were less motivated to try to do so¹⁶. One challenge may have been the inclusion of extrapolated mimics, meaning that the model(s) were no longer at one extreme of the axis. More generally, as both axis end points were based on Hymenoptera, these mimics may have been perceived, on average, as more wasp-like than the stimuli in the discrimination ability experiment, including lower variation in certain key features associated with Hymenoptera such as the narrow waist and long antennae.

Despite the relatively low levels of selectivity, we still observed variation among stimuli in their level of protection. We found no evidence that a mimic gains extra protection by an intermediate resemblance to multiple models, compared with a mimic with equivalent accuracy to only a single model. However, our test is based on a single population of predators encountering all stimuli. In theory, multiple models could provide an additive benefit in cases where different predators have learned to avoid different models, for example, due to allopatry or separate phenologies^9,12. Further experiments incorporating geographical and/or temporal variation in mimetic communities would be required to test this.

Although we found no evidence for a selective advantage for jack-of-all-trades mimics, the addition of an extra model species is still important to the evolution of mimicry. In the two-model treatment, we see less variation among mimics in the level of protection received, because there are more phenotypes that are close to a model, and there are therefore more ways of achieving the same level of accuracy. In nature, aversive models frequently exist as part of a Müllerian mimicry ring of species with a shared warning signal³⁴. In that context, there may be sizeable regions of phenotypic space in which a Batesian mimic would achieve similar levels of mimetic similarity to one member of the mimicry ring or another, and potentially experience relaxed selection for further improvements in accuracy.

Trait salience

The above experiments varied all components of visual phenotype simultaneously, but some elements of appearance may have had more influence on predator behaviour than others^22,37. Typically, colour is highly salient to birds, taking precedence over (overshadowing) other visual traits such as shape when choosing prey^19,22. However, the informativeness of a trait is context-dependent and specific to the trait values under comparison³⁸. If a salient mimetic trait has already evolved close resemblance to the model, predators may discriminate using other traits³⁷. We must therefore consider trait values that are relevant to the study system when determining which traits are under the strongest selection. We sought to identify traits under the strongest selection for mimicry in the context of avian discrimination between wasps and flies. If some traits overshadow others, this could explain imperfection in otherwise conspicuous traits.

We generated experimental stimuli based on an axis from the non-mimetic fly Tachina fera to the wasp V. vulgaris. We varied four components of the appearance independently from one another, such that shape, colour, pattern and size could separately be fly-like (poor mimicry), wasp-like (perfect mimicry) or intermediate (good mimicry; equivalent to 50% in the discrimination ability experiment). We generated 31 mimetic phenotypes (Extended Data Table 4), with each mimic given combinations of poor and perfect traits, or good and perfect (but never poor and good, to limit the number of experimental subjects and presentations required).
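The design can be enumerated with a short sketch. It assumes the phenotype set is the union of all poor/perfect combinations and all good/perfect combinations across the four traits; this yields 31 distinct phenotypes, matching the count in the text, although whether the all-poor and all-perfect training stimuli are counted among the 31 is an assumption here.

```python
from itertools import product

TRAITS = ("shape", "colour", "pattern", "size")

def combinations_of(levels):
    """All assignments of the given levels to the four traits, as a set of tuples."""
    return set(product(levels, repeat=len(TRAITS)))

poor_perfect = combinations_of(("poor", "perfect"))   # 2**4 = 16 phenotypes
good_perfect = combinations_of(("good", "perfect"))   # 2**4 = 16 phenotypes

# The all-perfect phenotype is shared by both sets, so the union has 16 + 16 - 1 members
phenotypes = poor_perfect | good_perfect

# No phenotype mixes poor and good traits, as stated in the text
mixed = [p for p in phenotypes if "poor" in p and "good" in p]
```

Excluding poor/good mixtures is what keeps the design at 31 phenotypes; a full three-level factorial across four traits would contain 81.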

To exert tighter control over the predator learning experience and facilitate the use of a larger number of stimulus types than in our wild-bird experiments, we conducted the experiment in the laboratory. We trained newly hatched chicks (Gallus gallus domesticus) in binary-choice trials to associate fly stimuli (poor in all traits) with a hidden mealworm reward and wasp stimuli (perfect in all traits) with no reward (Fig. 4a). Once the chicks showed a preference for the fly dish in at least 80% of presentations, we tested how they generalized their response to the various novel mimic (probe) stimuli in single presentations (Fig. 4b and Supplementary Video 2).

**Fig. 4: Chick behavioural response to multiple traits.**

The chicks did not reject any presented prey, but did show significantly greater latency to attack the unrewarded stimuli (wasp: median 1.28 s (1.139 lower quartile, 1.48 upper quartile), n = 545 presentations, 30 chicks) compared with the rewarded stimuli (fly: 0.86 s (0.76 lower quartile, 1.04 upper quartile), n = 544 presentations, 30 chicks; stimulus term from linear mixed model: 0.43 s, t = 24.5, P < 0.0001). Even hesitations of fractions of a second, as seen here, could determine prey capture versus escape in a natural context³⁹, and therefore influence the selective pressures experienced³². The latency to attack was also significantly affected by stimulus type among the novel probe stimuli (Fig. 4c, Extended Data Fig. 4 and Supplementary Data 2). We tested both additive and nested models to predict chick response and found strong support for a positive association between the degree of colour mimicry and the latency to attack (n = 910 presentations, 30 chicks; colour appears as an additive effect in all five top-ranked models with ΔAICc < 2; Supplementary Data 2). We also found some support for an influence of stimulus size on chick behaviour (size appears as an additive effect in three of the five top-ranked models with ΔAICc < 2, and a further one of five as a nested effect when colour is good or perfect; Supplementary Data 2).
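The model-ranking criterion can be sketched using the standard small-sample correction, AICc = -2 ln L + 2k + 2k(k + 1)/(n - k - 1), where k is the number of estimated parameters and n the number of observations. The log-likelihoods and parameter counts below are hypothetical placeholders, not values from the study, and the model names are illustrative only.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def delta_aicc(scores):
    """Difference between each model's AICc and the lowest AICc in the set."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}

# Hypothetical fits (assumption): model name -> (log-likelihood, number of parameters)
n_obs = 910  # number of probe presentations quoted in the text, used only for illustration
fits = {
    "colour": (-500.0, 3),
    "colour + size": (-499.2, 4),
    "null": (-520.0, 2),
}

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in fits.items()}
deltas = delta_aicc(scores)

# Models within 2 AICc units of the best are treated as the top-ranked set
top_set = sorted(name for name, d in deltas.items() if d < 2.0)
```

With these placeholder values, a trait such as colour that appears in every model of the top-ranked set would be read as strongly supported, which mirrors the reasoning applied to colour and size in the text.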

Our results reaffirm that colour should be under strong selection by birds for mimetic accuracy, being more salient than other visual traits²². Hoverfly and wasp colours are typically distinct enough to be theoretically discriminated by birds, but many are highly similar and could, under natural conditions, be indistinguishable⁴⁰. In those cases, our results would predict selection to act on size as the next most salient visual trait. Our experiment demonstrates a strong propensity of chicks to respond to subtle size differences, even though our stimuli only differed by 2 mm at most (body lengths: wasp 12 mm, fly 14 mm). The ability to recognize Batesian mimics from their size is known in some garden birds⁴¹, albeit involving larger differences. Pattern and shape appear to have weaker effects on predator behaviour, implying that these traits should experience relaxed selection²², which would explain some elements of inaccurate appearance in mimics.

Invertebrate predators

Many studies of Batesian mimicry, as with the experiments above, use birds as focal predators^16,18,22,23,41. The eye-of-the-beholder hypothesis^11,23 states that mimics that appear inaccurate to one receiver might be perceived as more accurate to another. Consequently, the selective landscape for mimetic phenotypes depends on the suite of predators encountering a given mimic (the multiple-predators hypothesis⁴²). Despite being important predators of many mimetic prey⁴³, and likely to attend to different aspects of a mimetic signal compared with vertebrates⁴², invertebrates are under-represented as predators in mimicry studies. Invertebrates including praying mantises⁴⁴, jumping spiders⁴⁵ and crab spiders⁴⁶ can learn to avoid aposematic prey, and will generalize this avoidance to mimics, but other studies of invertebrate taxa have shown limited visual discrimination⁴⁷. There is little evidence about how discerning these predators might be among mimics of varying accuracy (although see ref. ⁴⁸) and none have compared the responses of multiple taxa to the same stimuli. Determining the extent to which perceptions of mimetic accuracy vary among invertebrates and differ from those of vertebrates is therefore of interest.

We assessed the ability of several invertebrates to discriminate between fly and wasp stimuli: praying mantises (Mantidae), jumping spiders (Phidippus audax) and crab spiders (Synema globosum). Owing to difficulties in training these predators to associate inedible stimuli with a separate reward, we trained them to associate the wasp stimuli V100 with a negative experience, leading to the same broad outcome of training: the predators associated a non-mimetic fly (M100) with a more-positive outcome than V100. We then tested their response (Extended Data Table 6) towards stimuli from an axis running from M100 to V100 (as used in the discrimination ability experiment), with the aversive experience repeated in the case of V100 to reinforce the learning (Supplementary Video 3).

All three invertebrate taxa discriminated among the stimuli based on appearance (phenotype term in mantis models: n = 40 presentations, 8 individuals, ΔAICc = −36.4, d.f. = 4 (Fig. 5a); jumping spider: n = 57 presentations, 9 individuals, ΔAICc = −26.8, d.f. = 4 (Fig. 5b); crab spider: n = 50 presentations, 50 individuals, ΔAICc = −11.7, d.f. = 4 (Fig. 5c); Extended Data Table 5). There was a sharp increase in protection from mantises between M75/V25 and M50/V50, after which point protection reached similar levels to the aversive V100 phenotype. Jumping spiders showed a similar pattern but with a more gradual increase in protection, only reaching wasp-like levels of protection at M25/V75. The trends for the crab spiders are less clear-cut, but suggest similar levels of protection for all stimuli of M50/V50 and above.

**Fig. 5: Levels of protection received by different stimulus types resulting from invertebrate predators’ behaviour.**

This demonstrates that, despite the overwhelming focus on vertebrates (especially birds) as predators in mimetic systems, invertebrates can also exert selective pressure for mimicry. Mimics sharing sufficient similarity with the models were afforded a similar level of protection to the models themselves, but the similarity required was considerably lower than that observed in the discrimination ability experiment using great tits. Moreover, among the three invertebrate taxa there were further differences in the level of mimicry required for protection.

Comparisons among such taxonomically diverse predators are inevitably limited by the need to tailor experiments to the behaviour and physiology of the predator in question. The mimetic phenotypes acceptable to a predator depend, among other factors, on the benefit of attacking a mimic and the cost of attacking a model¹², which are impossible to standardize fully across taxa that vary, for example, in nutritional needs and foraging strategy. Even focusing only on the invertebrates, which all received the same aversive stimulus, we see variation in the protection received by M50/V50, which was accurate enough to receive the same protection as a wasp from the mantises and crab spiders, but not the jumping spiders (Fig. 5). A possible explanation is the varying visual abilities of the predators in question: they all use vision to hunt, but praying mantises have limited colour vision⁴⁹ in comparison to jumping spiders, which also have excellent acuity⁵⁰. Even if all of the predators were to detect the same visual information about the stimuli, their varying behaviour could be explained by differences in foraging strategy, such as levels of risk-aversion. Regardless of the underlying mechanism, the identity of the predator is clearly a key factor in determining the strength of selection exerted on visual features of mimics⁴².

Discussion

Testing the responses of predators to 3D hoverfly-like stimuli enabled us to examine directly the protection received by various mimetic phenotypes. Our pipeline for generating intermediate and extrapolated phenotypes enabled us to evaluate areas of the adaptive landscape describing the protection received by Batesian mimics. Specifically, we have (1) generated highly accurate, but not perfect, mimics to show that such mimics still undergo selection for accuracy by a sufficiently motivated avian predator; (2) generated intermediates between two model species to show that jack-of-all-trades mimics receive no increase in protection compared with ones with similar accuracy to a single model; and (3) varied four visual traits independently to show that shape and pattern may be under weaker selection for accuracy than size and, in particular, colour.

Controlled testing of these hypotheses would not be possible using only real insect specimens. Our 3D-printed stimuli are good but not perfect visual replicas of real insects, inevitably being limited by available technology in aspects such as wing transparency and movement. Similar to most mimicry research^4,14,23,36, we focus here on explaining the visual components of mimicry. Other sensory modalities such as olfaction⁵¹ and sound/vibration⁵² may also contribute to predator discrimination, especially in the case of invertebrate predators⁴², and could be explored in future work using a similar framework. Although our printed stimuli are not perceived by birds as identical to insects, birds will generalize from them to real specimens (Extended Data Fig. 2), demonstrating ecological relevance. Furthermore, both the realism of our stimuli and the degree to which we can manipulate their traits represent a step change compared to previous studies using artificial prey^19,20.

We have conducted the most extensive comparison yet of invertebrate responses to the same mimetic stimuli, testing the eye-of-the-beholder hypothesis^11,23. It is well known that varying visual systems can lead different predators to receive different information from the same signal¹³, and here we have shown this variation can explain the persistence of inaccurate mimicry under selection from some predators. Among prey experiencing attacks from predators such as praying mantises, a wide range of only moderately accurate mimetic phenotypes will receive protection through their mimicry. Mimics attacked by more discerning predators will experience selection for greater accuracy whereas, in mimics exposed to multiple predators, selection will depend on the combination of different levels of discernment, and/or traits used to identify models and mimics by different predators⁴².

Among the numerous proposed explanations for inaccurate mimicry, a key distinction lies between those suggesting an advantage to inaccurate mimics, and others predicting relaxed selection over a range of moderately accurate phenotypes⁸. We find no evidence in our experiments for a selective advantage to inaccurate mimics. Instead, we find that some traits, and prey of some predators, are likely to experience relaxed selection for visual mimicry.

The persistence of inaccurate mimicry in nature is a classic problem in evolutionary biology, notable for an abundance of theory, much of which has been challenging to test experimentally. Our use of cutting-edge 3D technology enabled us to explore the selective pressures on mimetic adaptation in great depth, revealing the incredible discriminatory ability of an insectivorous bird, as well as how model community, trait salience and predator species limit the degree of discrimination in other contexts. In allowing such fine manipulation of visual phenotypes, our approach brings flexibility to the study of mimicry and, more widely, to the exploration of adaptive landscapes for other complex morphological traits.

Methods

Production of artificial stimuli

Overview

To explore predator responses to realistic, but in some cases hypothetical, stimuli, we produced 3D-printed plastic insect replicas. Some stimuli were based on real insect specimens and were matched as closely as possible in shape, colour, pattern and size to the assigned insect. Other stimuli were produced by interpolating along a smooth gradient or axis running between two real specimens.

Specimens

Experimental stimuli were based on real wasp (Hymenoptera) and fly (Diptera) taxa chosen to represent different levels of mimetic accuracy (Extended Data Table 1). To generate intraspecific variation, in the wild-bird experiments, we used three different individuals from each taxon to produce separate stimuli.

Insect specimens were collected between June 2020 and August 2021 from various locations in England using a hand net. Specimens were euthanized by freezing at −18 °C for approximately 30 min. They were then pinned through the thorax and positioned into a natural-looking posture before drying for 6–24 h.

Photogrammetry

3D digital images of the insect specimens were obtained by photogrammetry, using a protocol adapted from a previous study⁵³. Specimens were suspended, with the anterior uppermost, on a motorized turntable (Genie Mini II; Manfrotto, Cassola, Italy), positioned against a white background and lit indirectly using two LED panel lights (22 W, 5600 K; Pixapro). They were photographed using a DSLR camera (Canon EOS 600D) and macro lens (Tamron SP 90 mm) with F20, 1/6 s exposure, ISO400. Each specimen was photographed from 36 different angles—three vertical camera positions at each of 12 equally spaced turntable orientations. Wings were removed and photographed separately (single photo at a perpendicular angle), because otherwise their positioning on the body prevented important details of the abdominal pattern and shape from being accurately reconstructed.

A 3D shape file (mesh) was built from the set of 36 photographs using the software 3DSOM⁵⁴, which uses the outline of the specimen in each photograph to carve out a 3D shape. The colour information from the photographs was then projected onto this 3D shape to give the corresponding colour pattern.

3D image processing

Except where noted, 3D images were edited using Blender⁵⁵. Using the images obtained from photogrammetry as starting points, we constructed axes of similarity between pairs of real insects through 3D morphological space. We defined axes of phenotypic variation along which the traits of shape, colour, pattern and size varied smoothly from one image to the other, and generated phenotypes by picking either intermediate or (in the multiple-models experiment) extrapolated points along those axes. The four traits were varied in parallel with each other, except in the trait salience experiment, in which they were varied independently. Details of the specimen images used as axis end points are given under each experiment heading below.

Owing to difficulties in both processing and printing of thin and elongated structures, legs and antennae were removed digitally from the meshes, to be added back at a later step in more simplified form. Wings were treated in a similar manner, having already been removed from specimens before photogrammetry.

Shape deformations were carried out using the software Deformetrica⁵⁶, which uses control-point-based large deformation diffeomorphic metric mapping. A single simplified template was projected onto both end points, such that each retained its shape features but was remapped onto new vertices that had a direct one-to-one correspondence between the two meshes. We then calculated the deformation of 3D space required to transform one shape into the other and, using this, calculated intermediate shapes along the same axis.

Pattern manipulation was performed using custom scripts in R (v.4.3.0)⁵⁷. Pattern data were mapped onto the reconstructed meshes for the two end points, and vertices were separated into two colour segments using k-means clustering (k = 2) of RGB colour values. A signed distance map was calculated for each end-point pattern, whereby all vertices were assigned a value being the shortest possible edge distance to a vertex of the opposing colour. We created new intermediate distance maps by taking weighted averages of the end-point distance maps, and then reverse-engineered them into binary colour patterns by assigning all positive vertices to one segment and negative vertices to the other.
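The distance-map morphing described above can be illustrated on a toy mesh. The following is a minimal sketch in Python (not the authors' R scripts), assuming the vertices have already been split into two colour segments. The chain-shaped mesh, the function names, the sign convention (positive for segment 1) and the handling of exact zeros are all illustrative assumptions.

```python
from collections import deque


def signed_distance_map(adjacency, segment):
    """Signed edge distance from each vertex to the nearest vertex of the
    opposing colour segment.

    adjacency: dict mapping vertex -> list of neighbouring vertices
    segment:   dict mapping vertex -> 0 or 1 (colour segment)
    Sign convention (assumed): positive in segment 1, negative in segment 0.
    """
    result = {}
    for start in adjacency:
        seen = {start}
        queue = deque([(start, 0)])
        distance = None
        # Breadth-first search gives the shortest path measured in edges.
        while queue:
            vertex, d = queue.popleft()
            if segment[vertex] != segment[start]:
                distance = d
                break
            for neighbour in adjacency[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, d + 1))
        if distance is None:
            raise ValueError("pattern must contain both colour segments")
        result[start] = distance if segment[start] == 1 else -distance
    return result


def morph_pattern(map_a, map_b, weight_b):
    """Weighted average of two distance maps, thresholded back to a binary
    pattern. Exact zeros are assigned to segment 0 (an assumption)."""
    blended = {v: (1 - weight_b) * map_a[v] + weight_b * map_b[v] for v in map_a}
    return {v: 1 if blended[v] > 0 else 0 for v in blended}


# Toy mesh: a chain of eight vertices, 0-1-2-...-7.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
pattern_a = {i: 1 if i <= 1 else 0 for i in range(8)}  # boundary after vertex 1
pattern_b = {i: 1 if i <= 5 else 0 for i in range(8)}  # boundary after vertex 5

map_a = signed_distance_map(chain, pattern_a)
map_b = signed_distance_map(chain, pattern_b)
halfway = morph_pattern(map_a, map_b, 0.5)
```

In this toy case the 50% intermediate places the colour boundary midway between the two end-point boundaries, which is the behaviour that motivates averaging distance maps rather than averaging the colours directly.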

Each segment was assigned a single RGB colour value calculated as the median of the original colour data from the vertices included in that segment. Colours for intermediate patterns were calculated as linear interpolations between the corresponding segments in the end points. Ultraviolet reflection was ignored because there is no evidence of such colour components in wasp or hoverfly patterns⁴⁰.
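As an illustration of the colour assignment, the sketch below takes a median colour for a segment and interpolates linearly between two end-point colours. It is written in Python rather than the authors' R, and it assumes the median is taken per RGB channel, which the text does not specify. The RGB values are invented for the example.

```python
from statistics import median


def segment_colour(rgb_values):
    """Median colour of one segment, taken per channel (an assumption)."""
    return tuple(median(channel) for channel in zip(*rgb_values))


def interpolate_colour(colour_a, colour_b, weight_b):
    """Linear interpolation between corresponding segment colours."""
    return tuple((1 - weight_b) * a + weight_b * b
                 for a, b in zip(colour_a, colour_b))


# Hypothetical vertex colours for one segment of one end point.
yellow_vertices = [(200, 180, 20), (210, 170, 30), (190, 175, 25)]
yellow = segment_colour(yellow_vertices)

# A 25% step from black towards a hypothetical target colour.
quarter = interpolate_colour((0, 0, 0), (200, 100, 50), 0.25)
```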

Owing to limited resolution at the printing stage, legs and antennae for all meshes were given the same standardized shape and a uniform colour. The shape was based around a cylinder, with diameter 0.6 mm for legs and 0.8 mm for antennae (thinner antennae were found to be too fragile after printing). In the case of legs, articulations were added to separate the coxa, femur, tibia and tarsus, and, for antennae, the cylinder was bent into a gentle curve. Colour was taken from whichever of the two body colour segments most closely matched the majority leg colour of real specimens. Antennal length was matched against distances measured from the original 3D digital image, with intermediates calculated by linear interpolation.

Wings were created with a flat shape, 0.4 mm thickness, based on the outline taken from photographs, which corresponded to shapes as they are typically seen when the insect is at rest. In contrast to Diptera, V. vulgaris and A. mystaceus have two pairs of wings but, at rest, the hindwings are hidden owing to overlap with the larger forewings (the latter being folded in the case of V. vulgaris). Wing shapes for intermediate meshes were calculated using the same deformation method as for the bodies. As our printing method was unable to recreate transparent materials, all wings were assigned a uniform colour value of 50% grey. This colour matched that of the bases to which the insects were attached (see below).

The various components (body, legs, antennae and wings) were combined digitally to produce a mesh of the whole insect and finally scaled to match the body length of the relevant end point, or a value calculated by linear interpolation for intermediates. A base was added to provide an attachment point for the object as a whole, as well as improving the structural integrity of the legs. This base was circular as viewed from above, with a narrow post extending up into the ventral side of the thorax.

An example axis (M. meridiana to V. vulgaris), viewable in 3D, is provided in Supplementary Data 1.

Additive manufacturing

We printed physical 3D representations of these digital insects on a HP Jet Fusion 580 machine using polyamide 12 powder (CB PA12) and colour cosmetic settings. Stimuli were printed at Matsuura Machinery for the discrimination ability and invertebrate predators experiments, and at the University of Nottingham for the rest. Stimuli were then given VaporFuse Surfacing treatment in a DyeMansion Powerfuse S, which created a less grainy, slightly glossier finish.

Nomenclature

We refer to stimuli in the text according to the initial letter of the genus of the axis end points, and the percentages by which each was weighted when creating any intermediate form. For example, C100 indicates a stimulus based 100% on Chrysotoxum, and M25/V75 indicates an intermediate with M. meridiana weighted by 25% and V. vulgaris weighted by 75%. In the multiple-models experiment, some stimuli were created by extrapolating beyond the range of the two end points, using weighted averages greater than 100% or below 0%, for example, A150/V−50.
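The naming scheme can be expressed as a small helper. This Python sketch is illustrative only: the function name and its arguments are assumptions, and it simply reproduces the label format shown in the examples above, including the Unicode minus sign used for extrapolated weights.

```python
def _format_percent(value):
    """Format a weight, using the Unicode minus sign for negative values."""
    return f"\u2212{abs(value)}" if value < 0 else str(value)


def stimulus_label(letter_a, letter_b, percent_b):
    """Label for a point on the axis from end point A to end point B.

    percent_b is the weight given to end point B; A receives 100 - percent_b.
    Weights outside 0-100 denote extrapolated phenotypes.
    """
    percent_a = 100 - percent_b
    if percent_b == 0:
        return f"{letter_a}100"
    if percent_a == 0:
        return f"{letter_b}100"
    return (f"{letter_a}{_format_percent(percent_a)}"
            f"/{letter_b}{_format_percent(percent_b)}")
```

For example, `stimulus_label("M", "V", 75)` gives `M25/V75`, and `stimulus_label("A", "V", -50)` gives `A150/V−50`.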

Ethical approval

The Trait Salience experiment was approved by Newcastle University AWERB committee (project ID 966). Wild-bird experiments (discrimination ability and multiple models) were approved by AWERB committees at University of Nottingham (project ID 260) and University of Cambridge (ref. NR2022/60).

Wild-bird experiments

Field site and study organisms

Fieldwork was conducted in Madingley Wood, Cambridgeshire, UK (52.217° N, 0.049° E), a deciduous woodland composed primarily of broadleaf hardwood trees. The wood has a resident population of great tits, some of which, as part of other projects, have been fitted with passive integrated transponder (PIT) tags. Tags of birds involved in this study were fitted between July 2018 and October 2022 under licence from the special methods of the BTO projects 1120 and 1121 held by HMR. Birds included both males and females and were a mix of ages from first-year juveniles upwards.

Feeding stations

Feeding stations were placed at intervals within the wood, positioned close to dense vegetation to provide cover for small birds, and separated from each other by at least 80 m. The feeding stations consisted of a 0.75 × 0.75 m wooden board on which a 7 × 7 array of 30 mm diameter Petri dishes was fixed. The board was placed on top of a 1.4 m wooden post and covered with a 0.75 × 0.75 × 0.75 m cage made from 7 mm square galvanized wire mesh. On one side of the cage, approximately 0.5 m above the bottom of the cage, a 30 mm entrance hole allowed small birds to enter past a data logger antenna. The antenna was linked to a data logger (Francis Scientific Instruments), which logged PIT tags of any tagged birds entering. A single horizontal perch ran across the cage at the level of the entrance, and a further six perches were placed approximately 100 mm above the surface of the board, running between rows of Petri dishes. A motion-sensitive camera trap (CY70, Ceyomur) was placed above the top of the cage pointing downwards, such that the cage entrance and all Petri dishes were in view.

An example video showing two great tit individuals interacting with a feeding station is provided in Supplementary Video 1.

Discrimination ability and multiple-model experiments

Two main experiments were conducted at this field site using similar methodologies, along with a third generalization test: the discrimination ability experiment ran from December 2021 to May 2022, multiple models from October 2022 to April 2023 and the generalization test from October to December 2023. These experiments differed in timings and the stimuli used as explained in the relevant sections below, and a few details relating to sample sizes as follows.

In the discrimination ability experiment, five feeding stations were used; two of those did not receive enough successful feeding events and were dropped from the study before the testing phase, leaving three feeders. In the multiple-models experiment, a sixth feeder was added and all were in use throughout the experiment. In the generalization test, four feeders were used, three of which had been used previously and one placed in a new location within the wood.

During the testing phase of the discrimination ability experiment, ten tagged great tits made more than 80 visits each, as did eight tagged individuals in the multiple-models experiment; five of these individuals were present in both experiments. An unknown number of untagged individuals also visited in both cases; trapping records indicate that approximately 71% of the population were tagged in November 2021 and 51% in January 2023. In the discrimination ability experiment, tagged birds directed most of their visits to a single feeder (median 90%, lower quartile 78%, upper quartile 95%). During the multiple-models experiment, fidelity to feeder was weaker (median 51%, lower quartile 42%, upper quartile 64%) but fidelity to treatment was high (median 81%, lower quartile 66%, upper quartile 92%). In the generalization test, only one tagged individual visited the feeders, with 68% of its visits to a single feeder. No tagging had been conducted that year, so many tagged birds had probably died or dispersed.

Stimuli: discrimination ability experiment

Stimuli were drawn from three axes, all ending at V. vulgaris and starting from fly taxa with varying levels of mimetic accuracy: M. meridiana, S. ribesii and Chrysotoxum (Extended Data Table 1). Each axis consisted of the two end points and three intermediates at 25%, 50% and 75% similarity to V. vulgaris.

In the training phase, we used 15 rewarding fly (M100) and 15 unrewarding wasp (V100) stimuli (with 19 dishes left unused). In the testing phase, we used 17 unrewarding V100 stimuli and 32 rewarding stimuli, the latter including 10 M100 stimuli as experienced in the training phase, as well as two of each of 11 new phenotypes. New phenotypes were three intermediates from the M. meridiana axis (M75/V25, M50/V50, M25/V75), S. ribesii (S100) and its three intermediates (S75/V25, S50/V50, S25/V75), and Chrysotoxum (C100) and its three intermediates (C75/V25, C50/V50, C25/V75).
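The composition of the testing-phase array can be checked with a short tally. The Python sketch below rebuilds the stimulus list from the counts given in the text; the variable and function names are illustrative, and the labels follow the nomenclature defined earlier.

```python
from collections import Counter

FLY_AXES = ["M", "S", "C"]           # Mesembrina, Syrphus, Chrysotoxum
WASP_PERCENTS = [25, 50, 75]         # intermediate similarity to V. vulgaris


def discrimination_testing_stimuli():
    """Return (new_phenotypes, full_stimulus_list) for one feeder."""
    new_phenotypes = []
    for letter in FLY_AXES:
        if letter != "M":
            # M100 was already used in training, so it is not a new phenotype.
            new_phenotypes.append(f"{letter}100")
        for wasp in WASP_PERCENTS:
            new_phenotypes.append(f"{letter}{100 - wasp}/V{wasp}")

    stimuli = ["V100"] * 17 + ["M100"] * 10   # unrewarding wasps, familiar flies
    for phenotype in new_phenotypes:
        stimuli += [phenotype] * 2            # two of each new phenotype
    return new_phenotypes, stimuli


new_phenotypes, stimuli = discrimination_testing_stimuli()
counts = Counter(stimuli)
rewarding = sum(n for label, n in counts.items() if label != "V100")
```

The tally gives 11 new phenotypes, 32 rewarding stimuli and 49 stimuli in total, matching the 7 × 7 array of dishes.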

Stimuli: multiple-models experiment

Stimuli were drawn from an axis running from A. mystaceus to V. vulgaris, representing two model species and related phenotypes. In addition to the end points, the axis included three intermediates (25%, 50% and 75%) and four extrapolations, two beyond each end point at distances equivalent to the 25% and 50% intermediates. A separate non-mimetic stimulus of M. meridiana M100 was used, with no intermediates.

Feeders were assigned to either one model (1M) or two model (2M) treatments, with treatments spatially grouped within the study site to reduce the chances of an individual bird that visited multiple feeders experiencing both treatments. In the training phase, we used 25 rewarding fly (M100) and 24 unrewarding wasp stimuli, the latter being either exclusively V100 (1M treatment) or 12 × V100 and 12 × A100 (2M treatment; Fig. 3a).

In the testing phase, we used 20 unrewarding wasp stimuli, either exclusively V100 (1M) or 10 × V100 and 10 × A100 (2M), and 29 rewarding stimuli of which 10 were M100 and the remaining 19 were drawn in equal numbers (with rounding) from the intermediate and extrapolated phenotypes of the A. mystaceus—V. vulgaris axis A150/V−50, A125/V−25, A75/V25, A50/V50, A25/V75, A−25/V125, A−50/V150 (Fig. 3c).

Stimuli: generalization test

Here we tested whether the birds would generalize their preference for flies over wasps, learned from the printed stimuli, to the real insects. In the training phase, we used 12 rewarding fly (M100) and 12 unrewarding wasp (V100) stimuli (with 25 dishes left unused). In the testing phase, we swapped half of the printed stimuli for dead, real specimens of the same fly and wasp species, glued to circular bases identical to those used for the printed stimuli. The testing phase was limited to 5 days to focus on the birds’ initial responses to the real specimens and minimize their opportunity to refine their learning. The short duration also minimized damage and decay of the specimens.

Habituation phase (wild birds)

Feeders were first provided with open Petri dishes which contained a single mealworm per dish, as well as peanuts placed on the board in between dishes (only provided during initial stages and when visitation rates were low, to encourage birds to visit). Food was refilled every 2–3 days, and the whole feeding apparatus was sterilized using 70% ethanol spray every 2 weeks. After 3 days, transparent lids were placed onto the Petri dishes so that mealworms were visible, but only accessible if the lids were opened. Over the course of four weeks, visiting great tits learned to open the lids by flipping them off using their beaks. Petri dishes and lids were then painted so that the contents were not visible until the lids were flipped. Great tits continued to search for food by flipping off the lids to obtain the mealworms and, in most cases, all 49 mealworms had been consumed after 2 days. Other bird species and small mammals were seen visiting the feeding stations to feed on the peanuts, but rarely opened lids. Of the 12,331 lids opened in the multiple-models experiment, 401 were opened by blue tits (C. caeruleus); these were included in the analysis given the close relatedness of blue tits and great tits. Mice opened 59 lids, which were excluded from analysis. Only great tits opened lids in the discrimination ability experiment and the generalization test.

Training phase (wild birds)

After the habituation phase, a 3D-printed stimulus was attached to the lid of each Petri dish. To train the birds to avoid the wasp stimuli (V100 and, for multiple models 2M treatment, A100), no food was provided in the corresponding dishes, and mealworms were placed only in the fly dishes (M100). Every 1–2 days we began a new session by replacing all of the lids in a new configuration, randomized with respect to board position, and restocking the relevant dishes with mealworms. The training phase continued for 3 weeks for the discrimination ability experiment, and 4 weeks for the generalization test. The training phase of the multiple-models experiment continued for 6 weeks, which included a gap of 1 week (Extended Data Fig. 3a) when cold weather forced a pause in the experiment because heavy frost made the dishes unopenable.
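The per-session randomization of lid positions could be implemented along the following lines. This is a Python sketch under stated assumptions: the function name is hypothetical, unused dishes are represented by `None`, and a simple uniform shuffle is used because the text does not describe the randomization procedure in more detail.

```python
import random


def randomize_board(stimuli, rows=7, cols=7, rng=None):
    """Assign stimuli to random positions on a rows x cols array of dishes.

    Positions without a stimulus are filled with None.
    """
    n_dishes = rows * cols
    if len(stimuli) > n_dishes:
        raise ValueError("more stimuli than dishes")
    rng = rng or random.Random()
    slots = list(stimuli) + [None] * (n_dishes - len(stimuli))
    rng.shuffle(slots)
    # Cut the shuffled list into rows to give the board layout.
    return [slots[r * cols:(r + 1) * cols] for r in range(rows)]


# Training phase of the discrimination ability experiment:
# 15 rewarded flies, 15 unrewarded wasps, 19 unused dishes.
training_stimuli = ["M100"] * 15 + ["V100"] * 15
board = randomize_board(training_stimuli, rng=random.Random(2021))
```

Passing a seeded `random.Random` makes a given layout reproducible, which is convenient when layouts need to be recorded alongside the video data.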

Testing phase (wild birds)

The testing phase followed the same methodology as the training phase, but introducing a wider range of stimuli in addition to those on which the birds had been trained (see the ‘Stimuli: discrimination ability experiment’ and ‘Stimuli: multiple-models experiment’ sections). All of the newly introduced stimuli were rewarded, representing mimics with varying levels of accuracy. This phase lasted 5 weeks (10 weeks for the multiple-models experiment).

Trait salience experiment

Study organisms and housing (chicks)

Domestic chicks (G. g. domesticus; P.D. Hook Hatcheries) were acquired immediately after hatching and housed in a laboratory at Newcastle University. Chicks (not sexed) were housed communally in two non-concurrent batches of 36 in a floor pen measuring approximately 2 m² with access to food (HPS Starter Crumb, Special Diets Services) and water ad libitum. The room was kept at 25 °C and under a 14 h–10 h light–dark cycle. The number of chicks was chosen with the aim of a sample size of 10–20 presentations per stimulus type, allowing for some exclusions due to failure to meet training criteria (see below).

Experimental arena (chicks)

The experiments took place in an arena measuring 140 × 70 × 40 cm and divided into three sections of lengths 25, 90 and 25 cm, separated by mesh barriers such that each section was visible from the others. The first section formed a buddy area to house two buddy chicks (from a stock of eight, rotated every hour) during all sessions. Buddy chicks were never used for experimental testing, but instead ensured that experimental chicks were always able to see and hear conspecifics, to reduce stress. The largest section of the arena was the experimental area, which included a removable board on which grey opaque food dishes, with removable lids, were mounted. The final section was a holding area in which chicks were placed during 30 s gaps between presentations.

An example video showing a chick approaching stimuli in the experimental arena is provided in Supplementary Video 2.

Stimuli (trait salience)

Stimuli were based on the non-mimic T. fera and the model V. vulgaris. Each of four traits—shape, colour, pattern and size—was varied independently to different levels of mimicry, being poor (matching T. fera), good (50% intermediate) or perfect (matching V. vulgaris). Stimuli were created in all possible combinations of poor and perfect traits, or good with perfect traits (but never poor and good traits in the same stimulus), resulting in 31 different trait combinations (Extended Data Table 4).
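The count of 31 trait combinations follows from enumerating the two permitted families of stimuli and counting the shared all-perfect stimulus once. The Python sketch below makes this explicit; the names are illustrative.

```python
from itertools import product

TRAITS = ("shape", "colour", "pattern", "size")


def trait_combinations():
    """All stimuli mixing poor with perfect traits, or good with perfect
    traits, but never poor and good in the same stimulus."""
    combos = set()
    for levels in (("poor", "perfect"), ("good", "perfect")):
        for combo in product(levels, repeat=len(TRAITS)):
            combos.add(combo)   # the all-perfect stimulus is added only once
    return sorted(combos)


combinations_list = trait_combinations()
```

Each family contributes 2⁴ = 16 combinations, and the stimulus that is perfect in every trait belongs to both, giving 16 + 16 − 1 = 31.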

Habituation phase (trait salience)

On the first day after arrival in the laboratory, chicks received six 2 min trials in the experimental area, foraging from eight open dishes containing mealworms T. molitor. Chicks were first grouped in threes, then pairs, then individually (two trials each). Before the last three sessions on day one, and all of the following sessions, chicks were food-deprived for 60 min to ensure motivation to forage.

Over the course of the following 6 days, chicks received one trial each day during which they received 16 presentations of two dishes, each containing a mealworm. During a presentation, chicks were placed in the main arena and had up to 30 s to obtain a mealworm. Chicks were removed before being able to consume the second mealworm and placed in the holding area for 30 s in preparation for the next presentation. Each day, opaque lids were placed increasingly covering the dishes until the lids were fully on and the mealworm completely hidden, teaching chicks to lift off a lid to obtain a mealworm.

Training phase (trait salience)

Chicks were each given a further series of trials during which they learned to discriminate fly from wasp stimuli through paired choices. Chicks were presented with the same two dishes as in habituation, but with one bearing a 3D-printed model of T. fera (fly, poor in every trait) and a mealworm inside, and the other with a model of V. vulgaris (wasp, perfect in every trait) and no reward. After the chick opened one of the two lids (or 30 s elapsed, whichever happened first), it was moved back into the holding area to prevent it accessing the other dish. The chicks then remained in the holding area for 30 s before the next presentation. Each day, the chicks received 1 trial of 16 presentations.

After 5 days of training the first batch of chicks, it was noted that some individuals showed a bias towards one of the two dish positions (left or right, not consistent across chicks), regardless of the stimulus. To reduce this stereotyped behaviour and encourage learning, we subsequently varied dish positions among presentations, placing dishes in two out of four possible positions along a line perpendicular to the chicks’ starting position.

Trials continued until chicks chose the fly dish on at least 13 of the 16 presentations, which took 7–11 days. We excluded 12 chicks that did not reach this learning threshold from further testing and analysis. We note that, as a result, our conclusions apply only to the subset of the chicks involved in the testing phase. The presence of some individual predators which are less selective does not prevent the majority, which do discriminate among prey types, from exerting selective pressure on mimetic phenotypes.

Testing phase (trait salience)

Chicks then received up to four further daily trials (some chicks that took longer to complete the training phase spent less time in the testing phase) testing their response to intermediate stimuli. The structure of trials was identical to the training phase, except that birds were given only one stimulus in each presentation. In each trial, chicks received six presentations of a Petri dish containing a mealworm and topped with the same fly stimuli used in training, six with no reward and topped with the wasp stimuli used in training and four further probe presentations. The probe stimuli were dishes containing a mealworm and topped with a novel insect, drawn at random from 31 possible trait combinations (Extended Data Table 4). Note that possible probe stimuli included one identical to the unrewarding wasp stimulus (perfect in all traits) but associated with a mealworm reward, so acting as a perfect Batesian mimic.

Chicks opened all dishes in the testing phase, without exception. The latency to attack was measured from the moment the chick was released into the arena to its first peck of the dish or lids. Given the speed at which chicks approached the dish (median, 1.1 s), timings were taken from video recordings slowed to 0.3× speed using the BORIS software package⁵⁸ to improve accuracy. Experimenters were not blind to stimulus type during this process.

Invertebrate predators experiment

Study organisms and housing (invertebrates)

Praying mantises of three species (Rhombodera kirbyi (n = 5, fourth instar to adult), Polyspilota aeruginosa (n = 1, subadult) and Pseudoxyops perpulchra (n = 2, third instar), all unsexed; BugzUK and LDW bugs) and jumping spiders (adult male and female P. audax obtained from Jumping Spiders Web) were housed individually in transparent plastic boxes (19 × 13 × 8 cm) in a laboratory at University of Nottingham. The room was kept at 26 °C and under a 12 h–12 h light–dark cycle. They were fed crickets or mealworms twice weekly, with all trials conducted 30 h after feeding.

We collected crab spiders (S. globosum, adult male and female) that were sitting in wait for prey on flowers (where they hunt for pollinating insects) around the Quinta de São Pedro field research station (38.568° N, 9.193° W) and surrounding areas of Sobreda, Portugal. Individuals found with recently killed prey were not included. Spiders were kept at the Quinta de São Pedro research station, and individually housed in transparent plastic universal tubes. Spiders were kept, unfed, for 48 h until use, but note that the median time since the last meal will have been longer. The room was kept at 22 °C, with no artificial light–dark cycle.

Experimental arena (invertebrates)

Mantis and jumping spider trials were performed inside an opaque plastic box (19 × 13 × 8 cm). A fishing line was fed through two small holes at either side of the box, with one end attached to a counterweight maintaining tension and the other end attached to a bobbin. The bobbin was spun by a motor, programmed with a microcontroller board (Arduino) to rotate in a randomized pattern (1–2 s clockwise or anticlockwise, 0–1 s pause, then repeated in the opposite direction). This moved stimuli left and right in rapid, jerky motions and encouraged striking⁵⁹. Stimuli were suspended by a fine steel wire loop from the fishing line, allowing them to dangle and move in three dimensions.
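The motor pattern can be sketched as a schedule generator. The example below is a Python simulation, not the Arduino program used in the experiment. It assumes that durations are drawn uniformly from the stated ranges, that the first direction is chosen at random, and that the direction then alternates after every pause.

```python
import random


def motor_schedule(n_moves, rng=None):
    """Generate a list of (action, duration_in_seconds) steps.

    Each move lasts 1-2 s and is followed by a 0-1 s pause; successive
    moves run in opposite directions.
    """
    rng = rng or random.Random()
    direction = rng.choice(["clockwise", "anticlockwise"])
    steps = []
    for _ in range(n_moves):
        steps.append((direction, rng.uniform(1.0, 2.0)))
        steps.append(("pause", rng.uniform(0.0, 1.0)))
        # Reverse direction for the next move.
        direction = "anticlockwise" if direction == "clockwise" else "clockwise"
    return steps


schedule = motor_schedule(6, random.Random(1))
```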

Crab spider trials used a similar arrangement of equipment but with an arena that was larger (69 × 38 × 41 cm) and included the addition of a single purple milk thistle (Galactites tomentosus) to provide a perch for the spiders. The fishing line to which stimuli were attached entered through a hole in the lid of the arena as opposed to the side, causing stimuli to move vertically towards and away from the flower, as opposed to left and right.

Example video clips showing the different predators being presented with stimuli in their respective arenas are provided in Supplementary Video 3.

Stimuli (invertebrates)

Stimuli were drawn from an axis running from M. meridiana (fly; M100) to V. vulgaris (wasp; V100) with three intermediates: M75/V25, M50/V50 and M25/V75. This axis matches one of the three axes used in the discrimination ability experiment. Stimuli were removed from their bases as the presentation method involved hanging down on a wire rather than resting on top of a lid.

Training phase (invertebrates)

Praying mantises (n = 8) and jumping spiders (n = 8) each underwent six aversive conditioning trials on separate days. In the first trial, the stimulus was randomly allocated (M100 or V100) for each individual then, in subsequent trials, the stimulus was alternated. After being placed into the arena, animals were given 1 min to acclimatize before the stimulus was introduced. All mantises attacked the stimuli within 10 min, and were immediately punished after attacking wasp stimuli (V100) by being prodded firmly on the thorax with a separate wasp stimulus attached to the end of a thin metal rod. Subjects appeared to be appropriately threatened by this punishment, responding by moving away from the rod. Jumping spiders did not always attack, and were punished (in the same way as the mantises) at the end of trials involving a wasp stimulus, regardless of whether the spider had attacked the stimulus or not. Fly stimuli (M100) were associated with no reward or cost.

Training for crab spiders was performed using a condensed protocol as it was not possible to maintain the wild-caught spiders in the laboratory for long periods. The spiders (n = 150) did not undergo trials with presentations of wasp or fly stimuli, but simply received the punishment without any previous associated stimulus or behaviour. However, this still provided an opportunity to associate the negative experience with the wasp stimulus owing to its use in the ‘punishment’ process itself.

Testing phase (invertebrates)

Praying mantises and jumping spiders received a further nine trials using the same procedures as the training phase. Five probe stimuli were presented consisting of each of the five points along the axis in random order, alternating with four reinforcement trials (two M100 and two V100). All attacks on (mantises) or encounters with (spiders) wasp stimuli were punished as before. Owing to restricted time in captivity, each crab spider was presented once with a single stimulus, selected from the five axis points at random.
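The nine-trial testing sequence for mantises and jumping spiders can be sketched as follows. This Python example assumes that the order of the four reinforcement trials is also randomized, which the text does not state; the function name is hypothetical.

```python
import random

PROBES = ["M100", "M75/V25", "M50/V50", "M25/V75", "V100"]
REINFORCEMENT = ["M100", "M100", "V100", "V100"]


def testing_sequence(rng=None):
    """Five probe trials in random order, alternating with four
    reinforcement trials, giving nine trials in total."""
    rng = rng or random.Random()
    probes = PROBES[:]
    reinforcement = REINFORCEMENT[:]
    rng.shuffle(probes)
    rng.shuffle(reinforcement)   # assumption: reinforcement order is random
    sequence = []
    for i, probe in enumerate(probes):
        sequence.append(("probe", probe))
        if i < len(reinforcement):
            sequence.append(("reinforcement", reinforcement[i]))
    return sequence


sequence = testing_sequence(random.Random(7))
```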

As in the training, mantises attacked all stimuli, and the latency to attack was measured from when the motor was switched on to the mantis first striking the stimulus. Spiders rarely attacked the stimulus (P. audax 11% of trials, S. globosum 17% of trials); thus, using latency to attack as the primary measure of behaviour would provide poor resolution. They did, however, display a range of positive (such as orientation towards the stimulus, approach) and negative (for example, retreat, hide) behaviours in response to stimuli; a full list of the observed behaviours is shown in Extended Data Table 6. Instances of these behaviours were recorded over the full trial period (P. audax, 5 min; S. globosum, 3 min). Experimenters were not blinded to the stimulus type during the laboratory trial.

Statistical analysis

All analysis was carried out in R (v.4.3.0)⁵⁷. We used generalized linear models and generalized linear mixed models implemented in the package lme4 (ref. ⁶⁰). In all cases, model fit was assessed visually for normality of residuals and homoscedasticity using residual plots. From a defined set of candidate models, the most parsimonious was selected based on lowest AICc values, with ties (a difference in AICc of less than two) broken by choosing the model with the fewest degrees of freedom⁶¹.
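The selection rule can be illustrated with a short Python sketch (the analysis itself was carried out in R). The AICc function uses the standard small-sample correction; the candidate models and their values are invented, and using the lower AICc as a secondary tie-breaker among models with equal degrees of freedom is an assumption.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for k estimated parameters and n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def select_model(candidates, tie_threshold=2.0):
    """Pick the most parsimonious model.

    candidates: list of (name, aicc_value, degrees_of_freedom)
    Models within tie_threshold of the lowest AICc are treated as tied,
    and the tie is broken by choosing the fewest degrees of freedom.
    """
    best = min(value for _, value, _ in candidates)
    tied = [c for c in candidates if c[1] - best < tie_threshold]
    return min(tied, key=lambda c: (c[2], c[1]))


# Hypothetical candidate set.
candidates = [("full", 100.0, 6), ("reduced", 101.5, 4), ("null", 110.0, 2)]
chosen = select_model(candidates)
```

Here the reduced model is chosen: its AICc is within two units of the lowest value and it has fewer degrees of freedom than the full model, whereas the null model is too far from the minimum to count as tied.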

Wild-bird experiments

Within each session and feeder, we determined the order of dishes being opened on the basis of video data, with any left unopened placed at the end of the sequence. Those coding the videos were not blinded to the stimulus identity during this process. We converted this sequence to a set of protection values from 0 to 1, corresponding to the first and last dishes of the sequence respectively. Thus, 0 can be considered to be the least protected as it is attacked first, and 1 the most protected as it is attacked last or not at all. These values were then logit-transformed using the formula log[(x + 0.01)/(1 − x + 0.01)] to occupy an unbounded scale, which improved normality of residuals. The 0.01 adjustment in this formula is to ensure that 0 and 1 transform to finite values⁶².
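The conversion from opening order to transformed protection values can be sketched in Python as follows. The adjusted logit is the formula given above; the assumption here is that protection values are spaced evenly from 0 to 1 along the sequence, which the text implies but does not state explicitly.

```python
import math


def protection_values(n_dishes):
    """Evenly spaced protection values: 0 for the first dish opened,
    1 for the last position in the sequence (assumed linear spacing)."""
    return [i / (n_dishes - 1) for i in range(n_dishes)]


def adjusted_logit(x, adjustment=0.01):
    """log[(x + 0.01) / (1 - x + 0.01)], finite at x = 0 and x = 1."""
    return math.log((x + adjustment) / (1 - x + adjustment))


values = protection_values(49)
transformed = [adjusted_logit(x) for x in values]
```

With the 0.01 adjustment the end points map to −log(101) and +log(101), roughly ±4.62, rather than to minus and plus infinity.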

For the discrimination ability experiment, initial preferences were assessed on the basis of bird behaviour in the first session of the training phase only. These preferences would have depended mostly or entirely on the subjects’ innate sensory biases and learning from their experience in the wild and not on their experience with the experimental stimuli (although some learning may have taken place from the very first dish onwards). We used the explanatory variables reward (binary variable: mealworm or no mealworm, here corresponding to fly or wasp stimuli respectively) and feeder (categorical variable identifying the feeding station, here treated as fixed due to only having three levels). We fitted the linear model preference ~ reward × feeder and compared it to all four nested submodels.
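
The text does not list the four nested submodels. One plausible reading, sketched below, is that they are the models obtained by dropping terms while respecting marginality (an interaction is kept only when both of its main effects are present). The helper names are hypothetical and the formulas are printed as strings only; nothing is fitted.

```python
from itertools import combinations


def hierarchical_submodels(main_effects, interaction):
    """Term sets for a two-factor model and its nested submodels.

    Assumption: marginality is respected, so the interaction appears
    only alongside every main effect.
    """
    models = []
    for k in range(len(main_effects) + 1):
        for subset in combinations(main_effects, k):
            models.append(list(subset))
    # The full model is the complete set of main effects plus the interaction.
    models.append(list(main_effects) + [interaction])
    return models


def as_formula(response, terms):
    """Render a term list as an R-style formula string."""
    return f"{response} ~ " + (" + ".join(terms) if terms else "1")


candidates = hierarchical_submodels(["reward", "feeder"], "reward:feeder")
for terms in candidates:
    print(as_formula("preference", terms))
```

Under this reading there are five candidates in total: the full model and four nested submodels (intercept only, reward only, feeder only, and both main effects without the interaction).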

To highlight trends of stimulus selection as time progressed in the training phase, we fitted a nonlinear least squares (NLS) model to the levels of protection for the wasp stimulus (as there were only two stimulus types, the pattern for fly stimuli is simply an inversion of this). We used a sigmoid learning curve defined by the formula \(\frac{a}{1+\mathrm{e}^{-b\times (t-c)}}\) based on time t measured as the number of sessions. This formula assumes that zero protection is received at time zero.
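
To make the roles of the three parameters concrete, the sketch below evaluates the sigmoid curve directly. The parameter values are hypothetical and chosen only for illustration; no fitting is performed, and the interpretation of a, b and c in the docstring follows from the formula rather than from any statement in the text.

```python
import math


def sigmoid_protection(t, a, b, c):
    """a / (1 + exp(-b * (t - c))).

    For b > 0, a is the upper asymptote, b controls the steepness and
    c is the session at which the curve reaches half of a.
    """
    return a / (1.0 + math.exp(-b * (t - c)))


# Hypothetical parameter values, not estimates from the study.
a, b, c = 2.0, 0.5, 10.0
curve = [sigmoid_protection(t, a, b, c) for t in range(0, 31)]
print(sigmoid_protection(c, a, b, c))  # half of the asymptote, i.e. a / 2
```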

In the testing phase, we fitted separate curves to each phenotype, using an asymptotic curve with the formula \(a+(b-a)\times \mathrm{e}^{-\mathrm{e}^{c}\times t}\) because, unlike in the training phase, there did not appear to be any initial warm-up period in the rate of learning. This approach enabled us to parameterize the learning period empirically according to the starting level of discrimination, rate of change and final level of discrimination, and therefore to identify a period of learning after which bird preferences were relatively stable. As a result, for our main analysis of the testing phase, we excluded the first 9 days of the testing phase while birds adapted to the new set of stimuli; from day 10 onwards, behavioural responses had reached within 10% of the asymptotic value according to the fitted learning curves. We used the explanatory variables feeder (as for training phase), plus axis (categorical variable for whether the stimuli were based on Mesembrina, Syrphus or Chrysotoxum), phenotype (the degree of similarity to the wasp, categorical to allow for a wide range of non-linear relationships) and edge (binary variable to indicate whether a dish was on the outer perimeter of the 49-dish array, included to improve the fit as we observed that birds preferred to open dishes along the perimeter of the cage; it was not tested for significance). We fitted the Gaussian linear model preference ~ (axis × phenotype + edge) × feeder and compared it to 32 nested submodels. This approach allowed the comparison of models with or without (1) an effect of phenotype, that is, the degree of similarity to the wasp; (2) an effect of axis (M. meridiana, S. ribesii or Chrysotoxum), potentially interacting with phenotype; and (3) individual variation in behaviour, according to the interactions with the feeder term, since feeders represent largely separate sets of individual great tits. Tukey’s post hoc comparisons were used to test for differences among different levels of the phenotype and axis variables.
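
The asymptotic curve and the idea of a learning cut-off can be sketched as follows. The sketch assumes that "within 10% of the asymptotic value" means the remaining gap to the asymptote has shrunk to 10% of the initial gap (b − a); the criterion actually applied may have been defined differently, and the parameter values are hypothetical.

```python
import math


def asymptotic_protection(t, a, b, c):
    """a + (b - a) * exp(-exp(c) * t): equals b at t = 0 and tends to a."""
    return a + (b - a) * math.exp(-math.exp(c) * t)


def time_within_fraction(c, fraction=0.1):
    """Time at which the gap to the asymptote is `fraction` of the initial gap.

    Assumption: the cut-off is defined relative to the initial gap (b - a),
    which gives t = ln(1 / fraction) / exp(c), independent of a and b.
    """
    return math.log(1.0 / fraction) / math.exp(c)


# Hypothetical parameter values, not estimates from the study.
a, b, c = 3.0, 1.0, -1.0
t_star = time_within_fraction(c)
print(t_star, asymptotic_protection(t_star, a, b, c))
```

Writing the rate as exp(c) keeps it positive for any real c, which is one reason this parameterization is convenient for unconstrained least-squares fitting.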

For the multiple-models experiment, again we fitted learning curves using NLS, but found that patterns in the data conformed less closely to simple curve definitions. In the training phase, discrimination initially improved and then appeared to regress after a spell of cold weather (see the ‘Training phase (wild birds)’ section above), possibly owing to turnover of individuals. Asymptotic curves fitted to the whole training phase (formula as described for the discrimination ability experiment) did not converge but, when fitted just to the sessions after the experimental pause, converged for two of the three stimulus types. In the testing phase, asymptotic curves fitted using NLS did not converge on a solution for any stimuli. This is probably because most stimuli showed no clear trends with time, except for M100, which showed an increase in the levels of protection during the first 10 days. In the absence of fitted curves, we used the same cut-off as in the discrimination ability experiment, which matched our subjective assessment of the data trends, removing data from days 1–9 when modelling learned preferences.

We compared models representing several specific hypotheses related to great tit behaviour in the testing phase. We used the explanatory variables edge (as in the discrimination ability experiment), feeder (now fitted as a random effect as there were six feeders), treatment (categorical, one model or two models), distance to model (continuous variable measuring distance to the nearest model along the Vespula–Argogorytes axis of similarity; in the two-model treatment, A150/V−50, A50/V50 and A−50/V150 would all have the same distance (50 percentage points) from a model and would be predicted to elicit similar responses from the predators), intermediate (categorical variable indicating whether a stimulus lies between the two models along the axis of similarity; that is, A75/V25, A50/V50 and A25/V75 are intermediate) and stimulus (categorical variable treating each position along the axis of similarity separately). All models used the Gaussian family and included fixed effects of edge and treatment and a random effect of feeder. H0: no effect of stimulus on bird preferences: preference ~ edge + treatment + (1|feeder). H1: stimuli receive protection in inverse proportion to their distance from a model: preference ~ distance_to_model + edge + treatment + (1|feeder). H2: as for H1, but intermediate mimics receive extra protection: preference ~ distance_to_model * intermediate + edge + treatment + (1|feeder). H3: certain stimuli elicit avoidance behaviour in unique and unpredictable ways: preference ~ stimulus + edge + treatment + (1|feeder). Each of H1–H3 was also tested with the addition of an interaction between treatment and the stimulus-related term (H1 + distance_to_model:treatment; H2 + intermediate:treatment; H3 + stimulus:treatment).
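
The two stimulus-derived predictors can be computed as in the sketch below. It assumes each stimulus is placed on the axis by its percentage of Vespula (so A150/V−50 is at −50 and A−50/V150 is at 150) and that, in the two-model treatment, the models sit at 0 (A100/V0) and 100 (A0/V100). The function names and the position coding are illustrative choices.

```python
def distance_to_model(vespula_pct, model_positions):
    """Distance (percentage points) to the nearest model along the axis."""
    return min(abs(vespula_pct - m) for m in model_positions)


def is_intermediate(vespula_pct, model_positions):
    """True when the stimulus lies strictly between the outermost models."""
    return min(model_positions) < vespula_pct < max(model_positions)


# Assumed model positions for the two-model treatment.
TWO_MODELS = (0, 100)

stimuli = {
    "A150/V-50": -50,
    "A75/V25": 25,
    "A50/V50": 50,
    "A25/V75": 75,
    "A-50/V150": 150,
}

for label, position in stimuli.items():
    print(label,
          distance_to_model(position, TWO_MODELS),
          is_intermediate(position, TWO_MODELS))
```

With these assumptions, A150/V−50, A50/V50 and A−50/V150 all receive a distance of 50, and only A75/V25, A50/V50 and A25/V75 are flagged as intermediate, matching the examples given in the text.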

Trait salience experiment

To standardize for variation in speed of behavioural responses of chicks among individual trials, within each trial, we compared the response to probe stimuli against values for the six fly and six wasp presentations. Within a trial, latency to attack was linearly scaled such that values for the response to fly and wasp presentations matched the median values from across all trials (0.855 and 1.28 s respectively). In 14 out of 105 trials, there was little (<0.1 s) or no delay in the mean response towards the wasp stimuli; these trials were excluded from further analysis.
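
A minimal sketch of this standardization is given below. It assumes that each trial is summarized by one reference latency for fly presentations and one for wasp presentations, and that the "delay" used for exclusion is the wasp reference minus the fly reference; both are interpretations of the text, and the function names and example latencies are hypothetical.

```python
FLY_TARGET = 0.855   # median latency to fly stimuli across all trials (s)
WASP_TARGET = 1.28   # median latency to wasp stimuli across all trials (s)


def keep_trial(fly_ref, wasp_ref, min_delay=0.1):
    """Keep a trial only if wasp responses are delayed by at least `min_delay` s.

    Assumption: delay is measured as wasp reference minus fly reference.
    """
    return (wasp_ref - fly_ref) >= min_delay


def scale_latency(x, fly_ref, wasp_ref):
    """Linearly rescale x so that fly_ref maps to 0.855 s and wasp_ref to 1.28 s."""
    slope = (WASP_TARGET - FLY_TARGET) / (wasp_ref - fly_ref)
    return FLY_TARGET + (x - fly_ref) * slope


# Hypothetical trial: fly reference 0.6 s, wasp reference 1.4 s, probe at 1.0 s.
if keep_trial(0.6, 1.4):
    print(scale_latency(1.0, 0.6, 1.4))
```

Excluding trials with a negligible wasp delay also avoids dividing by a near-zero difference in the scaling step.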

We compared models representing several sets of hypotheses about the relative importance of the four phenotypic traits in influencing chick behaviour. We used the explanatory variables day (categorical variable for the number of days through the testing phase, allowing for changes in behaviour depending on how many trials the chick has already completed), batch (categorical variable for which of two groups the chick belonged to, run on different dates), first_pres (binary factor indicating whether or not it was the first presentation of a trial, as we observed chicks to be slower on their first attempt of the trial), chick (random effect for individual ID), shape, colour, pattern and size (each represented by a three-level factor of poor (fly-like), good (intermediate between fly and wasp) or perfect (wasp-like)), and interactions to represent overshadowing, where the overshadowed trait is ignored unless another trait, termed the main trait, is above a certain level of accuracy. All models used the Gaussian family and included fixed effects of day, batch and first_pres, and a random effect of chick. H0: no effect of stimulus on chick behaviour: latency_to_attack ~ day + batch + first_pres + (1|chick). H1: each trait has a separate, additive effect on behaviour: latency_to_attack ~ day + batch + first_pres + shape + colour + pattern + size + (1|chick). Nested submodels were also fitted that excluded different combinations of the four trait terms. H2: one trait is assigned as overshadowing others, so that other traits are ignored unless the main trait is perfect (for example, latency_to_attack ~ day + batch + first_pres + colour + colour_perfect:shape + colour_perfect:pattern + colour_perfect:size + (1|chick)). The model was repeated with each of the four traits as the main trait, and nested submodels that excluded different combinations of the overshadowed traits were also fitted. H3: one trait is assigned as partially overshadowing others, so that other traits are ignored unless the main trait is good or perfect (for example, latency_to_attack ~ day + batch + first_pres + colour + colour_good_perfect:shape + colour_good_perfect:pattern + colour_good_perfect:size + (1|chick)), with variations as described for H2. H4: all trait combinations have their own unique effects on chick behaviour: latency_to_attack ~ day + batch + first_pres + shape × colour × pattern × size + (1|chick).
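
The overshadowing terms in H2 and H3 amount to gating the other traits on the level of the main trait. The sketch below shows that gating for a single stimulus; it is an illustration of the coding logic under our reading of the text, not the authors' implementation, and the function name and example stimulus are hypothetical.

```python
LEVELS = ("poor", "good", "perfect")  # ordered from fly-like to wasp-like


def visible_traits(traits, main, threshold="perfect"):
    """Trait levels that can affect the response under overshadowing.

    The main trait always contributes. Other traits contribute only when
    the main trait is at or above `threshold`; otherwise they are set to
    None, meaning they are ignored. threshold="perfect" corresponds to H2
    and threshold="good" to H3.
    """
    allowed = LEVELS[LEVELS.index(threshold):]
    unlocked = traits[main] in allowed
    return {
        name: (level if (name == main or unlocked) else None)
        for name, level in traits.items()
    }


# Hypothetical stimulus with good colour and mixed accuracy elsewhere.
stimulus = {"colour": "good", "shape": "perfect", "pattern": "poor", "size": "good"}

h2 = visible_traits(stimulus, main="colour", threshold="perfect")
h3 = visible_traits(stimulus, main="colour", threshold="good")
print(h2)
print(h3)
```

For this stimulus, only colour contributes under H2 (colour is not perfect), whereas all four traits contribute under H3 (colour is at least good).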

Invertebrate predators experiment

The response variable for mantis behaviour was latency to attack, measured in seconds and modelled using a Gaussian family with log link. Jumping spiders and crab spiders rarely attacked the stimuli directly so instead, response was the number of observations of positive behaviour towards the stimulus: display, approach and attack, and for jumping spiders, alert and orientation. These responses were modelled using a Poisson family with log link. Models included a fixed effect of phenotype (categorical variable for similarity to the wasp, as for discrimination ability above) and random effect of individual (except for the crab spiders, which had only one data point per individual). Tukey’s post hoc comparisons were used to test for differences among different levels of the phenotype variable.
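
The count response for the spiders can be sketched as a tally of positive behaviours per trial. The behaviour labels below are written as plain strings for illustration; the full ethogram is in Extended Data Table 6, which is not reproduced here, and the example event sequence is hypothetical.

```python
# Behaviours counted as positive for each predator, following the text:
# display, approach and attack for both, plus alert and orientation
# for jumping spiders.
POSITIVE = {
    "jumping_spider": {"display", "approach", "attack", "alert", "orientation"},
    "crab_spider": {"display", "approach", "attack"},
}


def positive_count(events, predator):
    """Number of recorded events that count as positive for this predator."""
    return sum(1 for event in events if event in POSITIVE[predator])


# Hypothetical sequence of behaviours scored during one trial.
trial = ["orientation", "approach", "retreat", "hide", "attack"]
print(positive_count(trial, "jumping_spider"))  # 3
print(positive_count(trial, "crab_spider"))     # 2
```

Counts of this kind are non-negative integers, which is consistent with the Poisson family and log link described above.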

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Data have been deposited at the NERC Environmental Information Data Centre and are available online: stimuli: ‘Scanned 3D images and 3D printable images based on combinations of features of Diptera and Hymenoptera collected from the UK in 2021–22’ (https://doi.org/10.5285/05169766-7355-4c3c-8ade-091db0583f9d); wild-bird experiments: ‘Great tit behavioural responses to 3D-printed insect replicas, featuring combinations of traits from wasps and flies, in Madingley Wood, Cambridge, UK, 2021–2023’ (https://doi.org/10.5285/a1c9b0cc-5585-49c5-a38f-fe05240edccf); trait salience: ‘Chick behavioural responses to 3D-printed insect replicas, featuring combinations of traits from wasps and flies’ (https://doi.org/10.5285/45391184-603e-4284-bb3c-9c8c6bf856ab); invertebrate predators: ‘Invertebrate behavioural responses to 3D-printed insect replicas, featuring combinations of traits from wasps and flies, in laboratory trials’ (https://doi.org/10.5285/ee7ba05a-449b-466e-840c-8de1d3f1d4d1). Source data are provided with this paper.

References

Bates, H. W. Contributions to an insect fauna of the Amazon Valley. Lepidoptera: Heliconidæ. Trans. Linn. Soc. Lond. 23, 495–566 (1862).
Ruxton, G. D., Sherratt, T. N. & Speed, M. P. Avoiding Attack: The Evolutionary Ecology of Crypsis, Warning Signals, and Mimicry (Oxford Univ. Press, 2004).
Leavey, A., Taylor, C. H., Symonds, M. R. E., Gilbert, F. & Reader, T. Mapping the evolution of accurate Batesian mimicry of social wasps in hoverflies. Evolution 75, 2802–2815 (2021).
Penney, H. D., Hassall, C., Skevington, J. H., Abbott, K. R. & Sherratt, T. N. A comparative analysis of the evolution of imperfect mimicry. Nature 483, 461–464 (2012).
McLean, D. J., Cassis, G., Kikuchi, D. W., Giribet, G. & Herberstein, M. E. Insincere flattery? Understanding the evolution of imperfect deceptive mimicry. Q. Rev. Biol. 94, 395–415 (2019).
Pekár, S. & Jarab, M. Assessment of color and behavioral resemblance to models by inaccurate myrmecomorphic spiders (Araneae). Invertebr. Biol. 130, 83–90 (2011).
Gilbert, F. in Insect Evolutionary Ecology (eds Fellowes M. et al.) 231–288 (CABI, 2005).
Kikuchi, D. W. & Pfennig, D. W. Imperfect mimicry and the limits of natural selection. Q. Rev. Biol. 88, 297–315 (2013).
Edmunds, M. Why are there good and poor mimics? Biol. J. Linn. Soc. 70, 459–466 (2000).
Pfennig, D. W. & Kikuchi, D. W. Competition and the evolution of imperfect mimicry. Curr. Zool. 58, 608–619 (2012).
Cuthill, I. C. & Bennett, A. T. D. Mimicry and the eye of the beholder. Proc. R. Soc. B 253, 203–204 (1993).
Sherratt, T. N. The evolution of imperfect mimicry. Behav. Ecol. 13, 821–826 (2002).
Endler, J. A. & Mappes, J. Predator mixes and the conspicuousness of aposematic signals. Am. Nat. 163, 532–547 (2004).
Dell’Aglio, D. D., Troscianko, J., McMillan, W. O., Stevens, M. & Jiggins, C. D. The appearance of mimetic Heliconius butterflies to predators and conspecifics. Evolution 72, 2156–2166 (2018).
Kikuchi, D. W. & Dornhaus, A. How cognitive biases select for imperfect mimicry: a study of asymmetry in learning with bumblebees. Anim. Behav. 144, 125–134 (2018).
Ihalainen, E., Rowland, H. M., Speed, M. P., Ruxton, G. D. & Mappes, J. Prey community structure affects how predators select for Müllerian mimicry. Proc. R. Soc. B 279, 2099–2105 (2012).
Kikuchi, D. W., Dornhaus, A., Gopeechund, V. & Sherratt, T. N. Signal categorization by foraging animals depends on ecological diversity. eLife 8, e43965 (2019).
Veselý, P., Luhanová, D., Prášková, M. & Fuchs, R. Generalization of mimics imperfect in colour patterns: the point of view of wild avian predators. Ethology 119, 138–145 (2013).
Corral-Lopez, A. et al. Field evidence for colour mimicry overshadowing morphological mimicry. J. Anim. Ecol. 90, 698–709 (2021).
Wilson, L., Lonsdale, G., Curlis, J. D., Hunter, E. A. & Cox, C. L. Predator-based selection and the impact of edge sympatry on components of coral snake mimicry. Evol. Ecol. 36, 135–149 (2022).
Kauppinen, J. & Mappes, J. Why are wasps so intimidating: field experiments on hunting dragonflies (Odonata: Aeshna grandis). Anim. Behav. 66, 505–511 (2003).
Kazemi, B., Gamberale-Stille, G., Tullberg, B. S. & Leimar, O. Stimulus salience as an explanation for imperfect mimicry. Curr. Biol. 24, 965–969 (2014).
Dittrich, W., Gilbert, F., Green, P., Mcgregor, P. & Grewcock, D. Imperfect mimicry: a pigeon’s perspective. Proc. R. Soc. B 251, 195–200 (1993).
Green, P. R. et al. Conditioning pigeons to discriminate naturally lit insect specimens. Behav. Processes 46, 97–102 (1999).
Royama, T. Factors governing the hunting behaviour and selection of food by the great tit (Parus major L.). J. Anim. Ecol. 39, 619–668 (1970).
Pyke, G. H., Pulliam, H. R. & Charnov, E. Optimal foraging: a selective review of theory and tests. Q. Rev. Biol. 52, 137–154 (1977).
Green, D. M. & Swets, J. A. Signal Detection Theory and Psychophysics (Wiley, 1966).
Getty, T. Discriminability and the sigmoid functional response: how optimal foragers could stabilize model-mimic complexes. Am. Nat. 125, 239–256 (1985).
Fisher, R. A. The Genetical Theory of Natural Selection (Clarendon, 1930).
Sherratt, T. N. State dependent risk taking by predators in systems with defended prey. Oikos 103, 93–100 (2003).
Beatty, C. D. & Franks, D. W. Discriminative predation: simultaneous and sequential encounter experiments. Curr. Zool. 58, 649–657 (2012).
Chittka, L. & Osorio, D. Cognitive dimensions of predator responses to imperfect mimicry. PLoS Biol. 5, e339 (2007).
Beatty, C. D., Beirinckx, K. & Sherratt, T. N. The evolution of müllerian mimicry in multispecies communities. Nature 431, 63–66 (2004).
Chatelain, P. et al. Müllerian mimicry among bees and wasps: a review of current knowledge and future avenues of research. Biol. Rev. 98, 1310–1328 (2023).
Bosque, R. J. et al. Diversity of warning signal and social interaction influences the evolution of imperfect mimicry. Ecol. Evol. 8, 7490–7499 (2018).
Akcali, C. K., Pérez-Mendoza, H. A., Kikuchi, D. W. & Pfennig, D. W. Multiple models generate a geographical mosaic of resemblance in a Batesian mimicry complex. Proc. R. Soc. B 286, 20191519 (2019).
Sherratt, T. N., Whissell, E., Webster, R. & Kikuchi, D. W. Hierarchical overshadowing of stimuli and its role in mimicry evolution. Anim. Behav. 108, 73–79 (2015).
Hanley, D. et al. Egg discrimination along a gradient of natural variation in eggshell coloration. Proc. R. Soc. B 284, 20162592 (2017).
Thyselius, M., Gonzalez-Bellido, P. T., Wardill, T. J. & Nordström, K. Visual approach computation in feeding hoverflies. J. Exp. Biol. 221, jeb177162 (2018).
Taylor, C. H., Reader, T. & Gilbert, F. Hoverflies are imperfect mimics of wasp colouration. Evol. Ecol. 30, 567–581 (2016).
Marples, N. M. Do wild birds use size to distinguish palatable and unpalatable prey types? Anim. Behav. 46, 347–354 (1993).
Pekár, S., Jarab, M., Fromhage, L. & Herberstein, M. E. Is the evolution of inaccurate mimicry a result of selection by a suite of predators? A case study using myrmecomorphic spiders. Am. Nat. 178, 124–134 (2011).
Carpenter, G. D. H. & Ford, E. B. Mimicry (Methuen & Company, 1933).
Berenbaum, M. R. & Miliczky, E. Mantids and milkweed bugs: efficacy of aposematic coloration against invertebrate predators. Am. Midl. Nat. 111, 64–68 (1984).
Taylor, L. A., Amin, Z., Maier, E. B., Byrne, K. J. & Morehouse, N. I. Flexible color learning in an invertebrate predator: Habronattus jumping spiders can learn to prefer or avoid red during foraging. Behav. Ecol. 27, 520–529 (2015).
Morris, R. L. & Reader, T. Do crab spiders perceive Batesian mimicry in hoverflies? Behav. Ecol. 27, 920–931 (2016).
Rashed, A., Beatty, C. D., Forbes, M. R. & Sherratt, T. N. Prey selection by dragonflies in relation to prey size and wasp-like colours and patterns. Anim. Behav. 70, 1195–1202 (2005).
Bowdish, T. I. & Bultman, T. L. Visual cues used by mantids in learning aversion to aposematically colored prey. Am. Midl. Nat. 129, 215–222 (1993).
Sontag, C. Spectral sensitivity studies on the visual system of the praying mantis, Tenodera sinensis. J. Gen. Physiol. 57, 93–112 (1971).
Harland, D. P., Li, D. & Jackson, R. R. in How Animals See the World: Comparative Behavior, Biology, and Evolution of Vision (eds Lazareva O. F. et al.) 132–163 (Oxford Univ. Press, 2012).
Manubay, J. A. & Powell, S. Detection of prey odours underpins dietary specialization in a Neotropical top-predator: how army ants find their ant prey. J. Anim. Ecol. 89, 1165–1174 (2020).
Moore, C. D. & Hassall, C. A bee or not a bee: an experimental test of acoustic mimicry by hoverflies. Behav. Ecol. 27, 1767–1774 (2016).
Nguyen, C. V., Lovell, D. R., Adcock, M. & La Salle, J. Capturing natural-colour 3D models of insects for species discovery and diagnostics. PLoS ONE 9, e94346 (2014).
3DSOM (Creative Dimension Software, 2013).
Blender—a 3D modelling and rendering package v. 2.9 (Stichting Blender Foundation, 2020).
Bône, A. et al. Deformetrica 4: an open-source software for statistical shape analysis. In Shape in Medical Imaging (eds. Reuter, M. et al.) 3–13 (Springer, 2018).
R Core Team. R: a language and environment for statistical computing (R Foundation for Statistical Computing, 2023).
Friard, O. & Gamba, M. BORIS: a free, versatile open-source event-logging software for video/audio coding and live observations. Methods Ecol. Evol. 7, 1325–1330 (2016).
Rilling, S., Mittelstaedt, H. & Roeder, K. D. Prey recognition in the praying mantis. Behaviour 14, 164–184 (1959).
Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
Leroux, S. J. On the prevalence of uninformative parameters in statistical models applying model selection in applied ecology. PLoS ONE 14, e0206711 (2019).
Warton, D. I. & Hui, F. K. C. The arcsine is asinine: the analysis of proportions in ecology. Ecology 92, 3–10 (2011).
Online 3D Viewer (Viktor Kovacs, 2023).

Acknowledgements

We thank P. Wilderspin and the staff at the University Farm & Rural Estate (University of Cambridge) for permission to conduct fieldwork in Madingley Wood; T. Fulford, C. Thorne, J. Beaver and the members of the Madingley ringing group for PIT tagging great tits at Madingley; M. Waddle and the Comparative Biology Centre staff for technical assistance in chick husbandry at Newcastle University; D. Starkey for pilot work with jumping spiders and mantises; B. Richter for permission to conduct crab spider experiments at the Quinta de São Pedro field centre; L. Baker for coding of video data; and P. Harris for sharing their HP Jet Fusion expertise. The project was funded by a NERC standard grant (NE/S000623/1), with additional funding from the University of Nottingham, Leverhulme Early Career Fellowship (ECF-2018-700) to G.L.D. and the Max Planck Society and Royal Society (RG110122) to H.M.R.

Author information

Authors and Affiliations

School of Life Sciences, University of Nottingham, Nottingham, UK
Christopher H. Taylor, David James George Watson, Danny Bell, Simon Burdett, Aoife Codyre, Kathryn Cooley, James R. Davies, Joshua Joseph Dawson, Tahiré D’Cruz, Samir Raj Gandhi, Hannah J. Jackson, Rebecca Lowe, Elizabeth Ogilvie, Alexandra Lei Pond, Hallie Rees, Joseph Richardson, Joshua Sains, Francis Short, Francis Gilbert & Tom Reader
Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
John Skelhorn
School of Biological Sciences, University of Bristol, Bristol, UK
James R. Davies
The Jolly Geographer, Babraham, UK
Samir Raj Gandhi
School of Mathematical Sciences, University of Nottingham, Nottingham, UK
Christopher Brignell
Department of Psychology, University of Cambridge, Cambridge, UK
Gabrielle L. Davidson
School of Biological Sciences, University of East Anglia, Norwich, UK
Gabrielle L. Davidson
Predators and Toxic Prey Research Group, Max Planck Institute for Chemical Ecology, Jena, Germany
Hannah M. Rowland
Department of Evolution, Ecology and Behaviour, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
Hannah M. Rowland
Faculty of Engineering, University of Nottingham, Nottingham, UK
Mark East & Ruth Goodridge

Contributions

Conceptualization: C.H.T., F.G. and T.R. Methodology: C.H.T., D.J.G.W., J. Skelhorn, J.R.D., C.B., G.L.D., H.M.R., M.E., R.G., F.G. and T.R. Formal analysis: C.H.T. Investigation (wild-bird experiments): C.H.T., D.B., S.B., A.C., K.C., J.R.D., S.R.G., E.O., A.L.P. and T.R. Investigation (trait salience experiment): C.H.T. and D.J.G.W. Investigation (invertebrate predators experiment): J.J.D., T.D., H.J.J., R.L., H.R., J.R., J. Sains and F.S. Data curation: C.H.T. Writing—original draft: C.H.T. Writing—review and editing: all of the authors. Visualization: C.H.T., H.J.J. and J. Sains. Supervision: J. Skelhorn, F.G. and T.R. Project administration: C.H.T., J. Skelhorn and T.R. Funding acquisition: C.H.T., J. Skelhorn, C.B., R.G., F.G. and T.R.

Corresponding author

Correspondence to Christopher H. Taylor.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Thomas Sherratt and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Changes in preference over time in the Discrimination Ability experiment.

“Level of protection” is the rank order of attack for that stimulus within a session, logit transformed. Higher level of protection indicates that a stimulus was attacked later in the sequence, or not at all. Points show mean and vertical bars show 95% confidence intervals based on the t-distribution. a Training phase. Data show wasp stimuli only; fly data are an almost exact inversion of the data shown. Line shows a sigmoidal curve fitted to the data. N = 828. b Test phase, trends with time. Asymptotic curves fitted to each phenotype. N = 1295 dishes (fly), 558 (75% fly), 553 (50-50), 552 (75% wasp), 1565 (wasp). c Test phase, comparing initial (yellow, sessions 1-3) and asymptotic (black, session 10 onwards, after fitted response reaches within 10% of the asymptote) preferences. Mesembrina axis, N = 191 dishes (initial), 1004 (asymptotic). Chrysotoxum axis, N = 96 (initial), 502 (asymptotic). Syrphus axis, N = 95 (initial), 497 (asymptotic). For images of the stimuli, see Fig. 2 in main text.

Extended Data Fig. 2 Validation experiment testing phase.

Points show mean and vertical bars show 95% confidence intervals based on the t-distribution. Capital letters indicate groupings which show no significant difference in a Tukey post-hoc test (p > 0.05). Sample sizes (number of dishes) are shown at the base of the plot.

Extended Data Fig. 3 Changes in preference over time in the Multiple Models experiment.

Points show mean and vertical bars show 95% confidence intervals based on the t-distribution. a Training phase. Lines show an asymptotic curve fitted to the data from session 16 onwards; the curve for A100 did not converge but is shown for illustration. Session 16 was the first session after a period of one week when no birds opened dishes. Curves fitted to the full time period all failed to converge. A100 session 9 had a small sample size (3) and as a result has very wide confidence intervals (−5.3, 8.3) that are not shown in full for clarity of the rest of the plot. N = 3470 dishes (M100), 769 (A100) and 2227 (V100). b Test phase, trends with time. Asymptotic curves fitted to these data, using the same method as the training phase, failed to converge. Instead, trend lines are a moving average across five sessions (centred on the third session). Sample sizes shown above each plot. c Test phase, comparing initial (yellow, session 1-3) and asymptotic (black, session 10 onwards, chosen to match Discrimination Experiment) preferences. N = 437 dishes (initial, 1 M), 384 (initial, 2 M), 2902 (asymptotic, 1 M), 4334 (asymptotic, 2 M).

Extended Data Fig. 4 Chick latency to attack mimetic stimuli.

Levels of accuracy of mimetic traits are coded as 0 (fly-like/poor), 50 (intermediate/good) and 100 (wasp-like/perfect) for each of shape, pattern, colour and size. Each panel shows a certain combination of colour and size traits, and within a panel, black points show certain combinations of pattern (P) and shape (S), and red points show data pooled across all values for pattern and shape (as shown in Fig. 4, main text). Time to attack (seconds) has been standardized across trials by linear scaling such that values for fly and wasp presentations match the median values across all trials, shown as horizontal reference lines (wasp upper, fly lower). Points show mean and vertical bars show 95% confidence intervals based on the t-distribution. Sample sizes (number of presentations) are shown at the base of each plot.

Extended Data Table 1 Species on which 3D stimuli were based

Extended Data Table 2 Comparison of models fitted to data from the testing phase of the discrimination ability experiment

Extended Data Table 3 Comparison of models fitted to data from the multiple-models experiment

Extended Data Table 4 Trait combinations used in the trait salience experiment

Extended Data Table 5 Comparison of models fitted to data from the multiple predators experiment

Extended Data Table 6 Behaviours performed by P. audax and S. globosum in trials within the multiple predators experiment

Supplementary information

Reporting Summary (PDF)

Supplementary Data 1 (ZIP)

3D examples. Five stimuli from the axis M. meridiana to V. vulgaris viewable in 3D (five .obj files each with an associated .mtl file, which provides colour data). Files can be viewed in a variety of 3D applications depending on your operating system, or using an online viewer such as https://3dviewer.net⁶³.

Supplementary Data 2 (CSV)

Trait salience models. Comparison of models fitted to data from the trait salience experiment. Codes for shape, pattern, colour and size columns are as follows: 0: not included; I: independent; N: nested within one of S (shape), P (pattern), C (colour) or Z (size); and the level of nesting is g (good) or p (perfect). All of the models also included fixed effects for starting presentation (first presentation of a trial), day and batch, and a random effect for chick.

Peer Review File (PDF)

Supplementary Video 1 (MOV)

Great tit behaviour. Two great tits opening dishes to obtain mealworms from a range of choices available during a testing trial of the discrimination experiment. The array of 49 lidded dishes is visible (some already opened), along with bamboo perches and the wire mesh which excluded larger birds and mammals from the feeding station.

Supplementary Video 2 (MOV)

Chick behaviour. A chick responding to four different stimuli during a testing trial of the multiple-traits experiment. The chick removes the lid of the dish in each case, obtaining a mealworm, except in the final example, which shows a wasp stimulus with no reward.

Supplementary Video 3 (MOV)

Invertebrate behaviour. This video includes three clips. a, A mantis approaching and attacking a wasp stimulus during a testing trial in the invertebrate predators experiment. b, A jumping spider attacking a wasp stimulus during a training trial in the invertebrate predators experiment. c, A crab spider responding to a fly stimulus during a preliminary trial in the invertebrate predators experiment. Note that the spider was not food deprived during this preliminary trial and did not show aggressive behaviour towards the stimulus in this case. Video footage from the crab spider testing trials is not available.

Source data

Source Data Fig. 2 (CSV)

Source Data Fig. 3 (CSV)

Source Data Fig. 4 (CSV)

Source Data Fig. 5 (CSV)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Taylor, C.H., Watson, D.J.G., Skelhorn, J. et al. Mapping the adaptive landscape of Batesian mimicry using 3D-printed stimuli. Nature 644, 706–713 (2025). https://doi.org/10.1038/s41586-025-09216-3

Received: 27 March 2024
Accepted: 29 May 2025
Published: 02 July 2025
Version of record: 02 July 2025
Issue date: 21 August 2025
DOI: https://doi.org/10.1038/s41586-025-09216-3

This article is cited by

3D printing offers a way to study mimicry by insects
- Thomas N. Sherratt
- Karl Loeffler-Henry
Nature (2025)
