Introduction

Within the first year of life, infants begin to learn the rules and structure of language based on statistical cues within the speech that they hear. This includes recognizing word boundaries within continuous streams of speech1,2, and developing sensitivity to grammatical relationships between words in a sentence3,4. Considerable research has demonstrated that nonhuman animals, and particularly nonhuman primates, are also sensitive to these statistical cues5,6,7. Indeed, monkeys learn many of the same statistical properties of these stimuli as humans8,9, and even grammatical structures that were previously thought to be unique to humans10 have recently been shown to be learnable by monkeys11,12. However, despite these cross-species commonalities in statistical learning, human infants are alone in the animal kingdom in going on to acquire language. This suggests that, while critical, these statistical learning processes alone are not sufficient to explain human linguistic abilities13,14,15.

Beyond statistical learning, one ability that is required for language (although it is likely also not sufficient) is the ability to learn abstract rules and to generalize them to novel contexts16. Unlike statistical regularities, which are tied to specific stimuli, these abstract rules do not apply to the perceptual features of any given stimulus, and instead can be generalized to novel contexts. For example, English speakers are able to apply grammatical rules to pluralize unfamiliar nouns (a single ‘wug’ is pluralized to several ‘wugs’17) or change the tense of a verb (‘talk’ becomes ‘talked’). This ability to generalize grammatical rules is necessary for language to have such generative power, allowing a limited number of words to be combined in an infinite number of ways, making ‘infinite use of finite means’18.

Here, it is important to note a key distinction between two fields of research that use the same nomenclature of ‘rule learning’ in different contexts. A large amount of research has also investigated rule-guided behaviors, in which a behavioral rule dictates which action an animal should perform, and much progress has been made understanding the cognitive and neural systems supporting these processes19,20,21,22. Here, however, we are describing a different type of rule, rules that govern the relationship between stimuli in a sequence, as grammatical rules govern the relationships between words in a sentence.

In a seminal study of this type of rule learning, Marcus and colleagues demonstrated that 7-month-old infants are capable of learning simple abstract rules governing the relationship between stimuli in a sequence, and of generalizing these rules to novel stimuli16. Infants were habituated to artificial language sequences composed of three phonemes following one of two rules (‘ABA’ or ‘ABB’). For example, a participant in the ‘ABA’ condition might hear ‘la ti la’, ‘wo fe wo’, etc. They were then presented with test sequences consisting of novel phonemes that were either consistent with the rule they had been exposed to (e.g., ‘tu po tu’) or inconsistent with it (e.g., ‘tu po po’). The infants showed stronger dishabituation responses (that is, they looked towards the audio speaker more frequently) when presented with inconsistent sequences than with consistent ones. This demonstrates not only that they had learned the rule, but also that they spontaneously generalized it to new phonemes, indicating that they had acquired a higher level representation of the rule. However, while several studies assessing this type of learning have been conducted in animals, their ability to learn abstract rules, and particularly to generalize them, is less clear.

One of the first studies in nonhuman animals trained rats to associate food rewards with visual sequences following one of three patterns (‘ABA’, ‘AAB’, or ‘BAA’)23. Rats were faster to search for food in the rewarded condition, suggesting that they learned these rules, although this did not require generalization. In a follow-up experiment, the authors trained rats with sequences of tones following the same patterns and tested them with sequences using pitch-shifted stimuli. Again, the rats were faster to search for food in the rewarded condition, suggesting that they generalized, or at least transposed, the learned rule to tones of higher frequencies23 (but see24,25).

In a field playback study, rhesus macaques were habituated to sequences of monkey vocalizations following ‘ABA’ or ‘AAB’ patterns26. When the monkeys were then presented with test sequences that were either consistent or inconsistent with these patterns, inconsistent sequences produced stronger dishabituation responses. Though the test stimuli were composed of novel vocalizations, they were still drawn from the same classes of monkey calls (i.e., a monkey habituated to ‘coo grunt coo’ would still be tested with ‘coo grunt coo’ vs. ‘coo grunt grunt’ stimuli, even though the individual ‘coo’ and ‘grunt’ vocalizations were novel). This study again suggests the capacity for nonhuman animals to learn these kinds of rules, but the evidence for generalization is less compelling.

Songbirds have also been shown to learn these types of rules, but once again, evidence for generalization is less clear. Zebra finches were trained using a go/no-go paradigm in which they were presented with sequences of bird calls and were trained to respond only when they heard a particular sequence structure (e.g., ‘AAB’) and to withhold a response if the structure was different (in this case, ‘ABA’ or ‘BAA’)27. While the birds learned the task, they were unable to generalize to novel stimulus elements (i.e., previously unheard bird calls). Possibly the most rigorous test of this paradigm comes from another go/no-go experiment in zebra finches and budgerigars28. The stimuli were carefully designed to assess whether the birds’ responses could be attributed to features of the individual stimuli (for example, a certain song motif might only appear as an ‘A’ stimulus in the training set). At test, the birds were presented with a range of test sequences that consisted either of known stimuli in unexpected positions (for example, the ‘A’ song motif might now appear as a ‘B’ stimulus) or of entirely novel stimuli. All the birds learned the rules, but did not generalize to novel stimuli, instead responding based primarily on the specific positions in which the individual stimulus elements occurred. However, in a follow-up experiment, budgerigars did appear to show some generalization of the rule to sequences consisting of novel birdsong motifs28.

These studies all suggest that a range of nonhuman animals are capable of learning these simple ‘ABA’, ‘BAA’, and ‘AAB’ sequence structures. However, the critical cognitive ability that this paradigm was designed to assess is generalization of these abstract rules to new contexts, and evidence for generalization is much less clear, particularly in nonhuman primates. Here, we conducted three experiments in monkeys and humans with the goal of assessing abstract rule learning and generalization. Our experiments differed from previous work in a number of ways.

Firstly, we used sequences of visual rather than auditory stimuli. This avoided any risk of obtaining spurious species differences due to differences in auditory perception or working memory29. Using a visual design also allowed us to present three stimulus sequences simultaneously, in a three-alternative forced-choice (3AFC) task (see Fig. 1). This approach allowed operant training of the animals (as in birds27,28) without requiring a go/no-go task. This is important because when an animal is trained to respond only to a certain type of stimulus (the ‘go’ condition), the presentation of a novel stimulus might be more likely to elicit a ‘no-go’ response, potentially occluding generalization effects. The 3AFC design requires a response on every trial, avoiding this potential bias.

Fig. 1

(A) Trial design and stimuli. In all trials, three sequences each consisting of three stimulus elements following the rules ‘ABA’, ‘AAB’ and ‘BAA’ were presented simultaneously on the screen. These stimuli were either colored squares or white shapes and were randomly generated for each trial (see Methods). For each participant, one rule was assigned as ‘correct’, and participants received positive reinforcement for selecting this stimulus sequence. (B) Experimental design. Participants initially learned the color rule (Phase 1.1), before we assessed generalization across stimulus domains to shape stimuli (Phase 1.2). A subset of monkeys and all humans were then trained on the shape rule (Phase 2.1). We then confirmed that they still remembered the color rule (Phase 2.2), before changing that rule (e.g., from ‘AAB’ to ‘BAA’; Phase 2.3). Finally, we again assessed generalization across stimulus domains, asking if this newly learned color rule would be generalized to the shape stimuli (Phase 2.4). Blue backgrounds denote phases which included shape stimuli.

Secondly, we approached the question of generalization differently from previous studies. Prior experiments initially trained animals with a small set of stimuli for a long period of time (for example, zebra finches were trained with the same 10 stimulus sequences for ~13,000 trials28) before introducing the generalization test. It is possible that large amounts of training on a small set of stimuli may actually inhibit subsequent generalization, due to overlearning. To avoid this issue, we generated novel ‘A’ and ‘B’ stimuli for every trial (see Methods), so that our task required rule generalization from the outset.

Finally, beyond assessing generalization to different exemplars of the same stimulus category (different phonemes in infants16, tones in rats23, vocalizations in monkeys26 and song elements in birds27,28), we wanted to further probe the flexibility of rule generalization in monkeys and humans. Therefore, as well as presenting novel sequences on every trial, we included testing phases which required generalization to an entirely different stimulus domain (from rules based on color to rules based on shape).

Together, these methods allowed us to identify rule learning and generalization in monkeys beyond what has previously been reported in nonhuman animals. However, our results also highlight important differences between humans and monkeys, particularly in the rate of rule learning, flexibility of generalization and the nature of the rule representations formed. These experiments provide important insights into the evolution of cognitive mechanisms supporting rule learning and generalization.

Results

Experiment 1: monkeys learn abstract rules and generalize to novel stimuli

Rhesus macaques (n = 6, ages: 5–6 years) were simultaneously presented with three sequences of colored stimuli conforming to three abstract rules (‘ABA’, ‘AAB’, and ‘BAA’) in a three-alternative forced-choice (3AFC) touchscreen task (Phase 1.1, see Fig. 1). Monkeys were each assigned one of these rules (e.g., ‘AAB’) as ‘correct’ (counter-balanced between animals). If monkeys selected the correct stimulus, they received positive feedback (a green screen, positive ‘ding’ sound, and a food reward), while if they selected any other stimulus, they received negative feedback (red screen, negative ‘beep’ sound and no food reward). On each trial, the colored stimuli were randomly generated (see Methods), thus requiring learning and generalization of a higher level abstract rule, rather than focusing on any perceptual feature of the individual stimulus elements.
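The trial structure described above can be illustrated with a minimal sketch. It is not the authors' stimulus code: the RGB encoding, the function names, the independent draw of an ‘A’/‘B’ pair for each sequence, and the shuffled screen order are all assumptions made for illustration (the actual generation procedure is given in the Methods).

```python
import random

RULES = ("ABA", "AAB", "BAA")

def random_color(rng):
    """Return a random RGB triple (hypothetical stimulus encoding)."""
    return tuple(rng.randrange(256) for _ in range(3))

def make_sequence(rng, rule):
    """Build one three-element sequence following `rule` from fresh A and B colors."""
    a = random_color(rng)
    b = random_color(rng)
    while b == a:  # A and B must differ, otherwise the rules are indistinguishable
        b = random_color(rng)
    elements = {"A": a, "B": b}
    return [elements[symbol] for symbol in rule]

def make_trial(rng, correct_rule):
    """Build one 3AFC trial: one sequence per rule, shown in a random order."""
    sequences = {rule: make_sequence(rng, rule) for rule in RULES}
    screen_order = list(RULES)
    rng.shuffle(screen_order)  # assumption: screen position is randomized per trial
    return {"sequences": sequences, "screen_order": screen_order, "correct": correct_rule}

def is_correct(trial, chosen_rule):
    """A choice is reinforced only if it matches the participant's assigned rule."""
    return chosen_rule == trial["correct"]
```

Because the colors are regenerated on every call, no specific color can predict reward in this sketch; only the relational pattern within a sequence distinguishes the rewarded option.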

Initially, monkeys responded at random, but after a considerable learning period (several thousand trials, see Fig. 2, S1-2) all the animals began to consistently select the ‘correct’ stimuli. Learning was gradual, with monkeys taking more than 10,000 trials each to reach our criterion for success, and many thousands of trials to first exceed the 95% confidence interval (Table S1). However, all six animals ultimately achieved very high levels of learning (chi-squared test on final learning block, p < 0.0001 for all monkeys; one-sample t-test across animals, t5 = 24.44; p < 0.0001; Table 1), demonstrating that they were able to learn the abstract rule, and that they generalized it to novel colored stimuli on a trial-by-trial basis.

Table 1 Statistical tests for experiment 1. Chi-squared tests were used for within-individual analyses, and t-tests were used for group-level analyses. Effect sizes for the chi-squared tests were calculated as Cramer’s V, \(V=\sqrt{\chi^{2}/(n \cdot df)}\), where \(\chi^{2}\) is the chi-squared statistic, n is the number of trials and df is the degrees of freedom. Effect sizes for t-tests were calculated as Cohen’s d = (sample mean – expected mean)/sample standard deviation.
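As a worked illustration of the two effect-size formulas in the Table 1 caption, the following sketch computes them with the Python standard library. The goodness-of-fit test against equal choice among the three rules, the function names, and the choice of df = 2 in the example are assumptions for illustration; the numbers used are hypothetical and are not data from the study.

```python
import math
import statistics

def chi_square_uniform(counts):
    """Chi-squared statistic for observed choice counts against equal expected counts."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

def cramers_v(chi2, n, df):
    """Cramer's V as defined in the caption: sqrt(chi2 / (n * df))."""
    return math.sqrt(chi2 / (n * df))

def cohens_d(values, expected_mean):
    """Cohen's d: (sample mean - expected mean) / sample standard deviation."""
    return (statistics.mean(values) - expected_mean) / statistics.stdev(values)

# Hypothetical block of 200 trials: 180 choices of the rewarded rule, 10 of each other rule.
counts = [180, 10, 10]
chi2 = chi_square_uniform(counts)
v = cramers_v(chi2, n=sum(counts), df=len(counts) - 1)

# Hypothetical per-animal proportions correct, compared with chance (1/3).
d = cohens_d([0.90, 0.95, 0.85], expected_mean=1 / 3)
```

For these hypothetical counts the chi-squared statistic is 289 and V is 0.85, up to floating-point rounding.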
Fig. 2

Abstract rule learning in six rhesus macaques. (Phase 1.1) Over time, all monkeys learned the abstract rules, which required that they generalize to novel colored stimuli on each trial. Colored lines show the proportion of trials on which the stimuli conforming to each rule type were selected (blocks of 200 trials). Thicker lines denote the ‘correct’ rule for each monkey. Dashed line shows chance performance and shaded gray areas denote 95% confidence intervals (calculated by permutation test). (Phase 1.2) Performance remained very high on standard (color) trials (left; p < 0.0001 in all cases), but monkeys showed no generalization on probe (shape) trials (right, blue background; p > 0.05 in all cases). These data are also visualized using a 200 trial sliding window (S1) and aggregated by testing day (S2).
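One way to approximate a chance band like the shaded area in this figure is to simulate random responding. The sketch below is a Monte Carlo simulation under the assumption of uniform random choice among three options in blocks of 200 trials; it is not necessarily the permutation procedure the authors used, and the function name, number of simulations and seed are hypothetical.

```python
import random

def chance_band(n_trials=200, n_options=3, n_sims=2000, alpha=0.05, seed=0):
    """Estimate the central (1 - alpha) range of the proportion of trials on which
    one given option is chosen, if every choice is uniformly random."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sims):
        # Count how often option 0 is picked in one simulated block.
        hits = sum(1 for _ in range(n_trials) if rng.randrange(n_options) == 0)
        proportions.append(hits / n_trials)
    proportions.sort()
    lower = proportions[int((alpha / 2) * n_sims)]
    upper = proportions[int((1 - alpha / 2) * n_sims) - 1]
    return lower, upper

lower, upper = chance_band()
```

A block in which the proportion of choices for one rule falls above `upper` would be unlikely under random responding, which is the sense in which the learning curves are read against the gray band.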

To further assess the generalization of these rules, we then tested whether monkeys would spontaneously apply these rules to a different stimulus domain, from colors to shapes (Phase 1.2). In this phase, 90% of trials were ‘standard trials’ using color stimuli identical to the previous phase. The remaining 10% of trials were ‘probe trials’, on which three sequences consisting of different shapes were presented instead of the color stimuli (see Fig. 1 and Methods). The shape stimuli conformed to the same ‘ABA’, ‘AAB’ and ‘BAA’ rules as the color stimuli. On these probe trials no auditory or visual feedback was provided, and the macaques were rewarded regardless of their choices. On the standard, color trials, as expected, all monkeys continued to select sequences conforming to the ‘correct’ rule at very high levels (individual analyses: p < 0.0001 in all cases; group analysis: t5 = 24.85; p < 0.0001; Table 1, Fig. 2, S4, Table S2). However, none of the six monkeys differed from chance levels when classifying the shape stimuli in the probe trials (individual analyses: p > 0.05 in all cases; group analysis: t5 = −0.132; p = 0.901; Table 1, Fig. 2). These results suggest that while monkeys are able to acquire abstract rules and generalize them across stimuli within a single domain (color), this generalization does not automatically extend to a new stimulus domain, hinting at limits to the flexibility with which monkeys apply abstract rules.

Experiment 2: monkeys fail to generalize across stimulus domains

Although Experiment 1 demonstrated generalization of abstract rules to novel colored stimuli, it found no evidence that monkeys generalized across stimulus domains, from color to shape stimuli. Experiment 2 tested this further in a subset of the macaques tested previously (n = 2, the first animals to complete Experiment 1) to assess whether this generalization could be elicited by further training. We trained monkeys that both the color and shape sequences followed the same rule, then changed the rule that applied to the color sequences. The goal of this experiment was to ask whether the monkeys would generalize and apply this new color rule to the shape sequences, or instead persist in using the previous rule, which would suggest a more restricted, domain-specific rule representation.

Immediately following the completion of Experiment 1, two monkeys began Experiment 2. Trials were identical to Experiment 1, except that we now presented shape stimuli instead of the color stimuli (Phase 2.1). For each monkey, the sequence following the same ‘correct’ rule as in Experiment 1 (e.g., ‘AAB’) was rewarded. Despite having already learned which rule applied to color stimuli, the monkeys showed no facilitation benefit in learning the shape rule. In fact, the monkeys showed no evidence of learning the shape rule for approximately 20,000 trials each (Fig. 3, S3). In an attempt to encourage learning, an animation was added to the stimuli, causing first the ‘A’ then the ‘B’ elements to bounce up and down, to draw attention to the difference between the ‘A’ and ‘B’ stimuli in each sequence. This addition was successful. Both monkeys eventually learned to classify the shape sequences, and this pattern of performance persisted after the removal of the animation (p < 0.0001 in both cases, Table 2). Total training on the shape stimuli required 50,000–60,000 trials. Following successful learning of the shape rules, the monkeys were moved back to training on the color stimuli, to ensure that performance remained at high levels (Phase 2.2). The monkeys either remained at criterion performance or quickly returned to this level (p < 0.0001, Fig. 3, S3, Table 2).

Fig. 3

Monkeys did not generalize between stimulus domains. Monkeys (n = 2) progressed through several different experimental phases. When trained using shape stimuli (Phase 2.1), monkeys initially showed no learning. The addition of an animation to highlight the ‘A’ and ‘B’ stimuli (by making them briefly ‘bounce’ up and down) successfully induced learning, which persisted after the removal of the animation. After the monkeys reached our learning criterion, we moved them back to color trials to confirm they remembered the color rules (Phase 2.2). Then, the color rules were changed, to put them in opposition with the shape rules (Phase 2.3). Finally, we conducted a final generalization task to ask if monkeys would generalize the new color rule to the shape stimuli (Phase 2.4). The monkeys did not, and both animals showed a significant preference for the previously learned shape rule (* = p < 0.01).

Table 2 Statistical tests for experiment 2. Chi squared tests were used for within-individual analyses. Effect sizes were calculated as in Table 1.

At this stage, the monkeys had learned that both the color and shape stimuli followed the same abstract rule. To further assess generalization, we then changed which rule dictated the ‘correct’ color sequence (e.g., from ‘AAB’ to ‘BAA’), requiring the monkeys to change their strategy (Phase 2.3). The monkeys learned this successfully, but again, learning was slow and incremental, taking more than 10,000 trials in both cases (p < 0.0001, Fig. 3, S3, Table 2). This implies that rather than simply switching which rule they applied, the monkeys had to unlearn the previous rule and then re-learn a new one, suggesting limits on the flexibility with which these rules are applied.

By now, the monkeys had learned that while the same rule (e.g., ‘AAB’) had previously applied to both the color and the shape stimuli, the color rule had changed (e.g. to ‘BAA’), potentially putting the rules in opposition with one another. We then presented the monkeys with infrequent probe trials consisting of shape stimuli (Phase 2.4). In this phase, if the monkeys spontaneously applied the new color rule to the shape stimuli, this would suggest broad, cross-domain generalization. By contrast, if they persisted with the old shape rule, despite the change to the color rule, this would suggest a more restricted, context specific representation of the rule. Both monkeys showed the latter pattern, showing a modest but significant preference for the prior shape rule (p < 0.01 in both cases, Fig. 3, S3, S5, Table 2), rather than generalizing the new color rule to the shape stimuli. This result also implies that the monkeys maintained a memory of a rule they last encountered more than 10,000 trials previously, in Phase 2.1 (Fig. 3, S3). These results (and the notable consistency between the animals) suggest that while monkeys are capable of (slowly) acquiring abstract rules and generalizing them to novel sequences within a stimulus domain, this generalization is somewhat restricted and domain specific, with no generalization across stimulus domains.

Experiment 3: humans rapidly learn and spontaneously generalize abstract rules, within and across stimulus domains

Experiments 1 and 2 highlighted abilities and constraints on monkeys’ rule learning and generalization. Experiment 3 tested human participants (n = 48, mean age = 18.8 years) using almost identical methods, with the only difference being fewer trials required to achieve criterion to advance between phases (see Methods).

As expected, human participants performed at very high levels. When presented with sequences of colored stimuli (Phase 1.1), all participants almost immediately learned to select the correct rule (one sample t-test, t47 = 37.53; p < 0.0001; see Table 3, S6). In the subsequent shape generalization test (Phase 1.2), participants continued to perform at high levels on the color trials (t47 = 420.47; p < 0.0001), and every participant spontaneously selected the shape stimuli conforming to the same rule as the color stimuli (t47 = 57.52; p < 0.0001; Fig. 4A). This result contrasts starkly with the monkeys’ performance, which showed no evidence of generalization (compare Phase 1.2 in Figs. 2 and 4).

Table 3 Descriptive statistics for human performance on each learning phase of experiment 3. Each phase continued until the participants reached our learning criterion (performance exceeding 80% in the preceding 40 trials), and average trials to complete these phases are reported.
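The learning criterion stated in this caption (performance exceeding 80% in the preceding 40 trials) can be expressed as a sliding-window check. The sketch below is illustrative only: the function name and the 0/1 coding of outcomes are assumptions, and the defaults reflect the criterion reported for human participants.

```python
def trials_to_criterion(outcomes, window=40, threshold=0.8):
    """Return the 1-based trial number on which the criterion is first met, or None.

    `outcomes` is a sequence of 1 (correct) and 0 (incorrect) in trial order.
    The criterion is met when accuracy over the most recent `window` trials
    strictly exceeds `threshold`.
    """
    for end in range(window, len(outcomes) + 1):
        recent = outcomes[end - window:end]
        if sum(recent) / window > threshold:
            return end
    return None

# Hypothetical participant: 10 errors followed by a run of correct responses.
example = [0] * 10 + [1] * 40
first_trial = trials_to_criterion(example)
```

With this definition, a participant cannot complete a phase in fewer than `window` trials, and exactly 80% correct within the window does not satisfy the criterion.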
Fig. 4

Humans showed spontaneous rule generalization across stimulus domains. (A) In Phase 1.2, human participants (n = 48, grey circles) performed at very high levels on the standard (color) trials, and all spontaneously generalized to the probe (shape) stimuli. For ease of display, data from human participants are collapsed across all rules (‘AAB’, ‘ABA’, ‘BAA’) based on whichever rule was ‘correct’ for each participant. (B) In Phase 2.4, after learning that the rule which applied to the color stimuli had changed, every participant generalized this new rule to the shape stimuli, rather than persisting with the prior shape rule.

Human participants then rapidly proceeded through the subsequent phases, corresponding to monkey Experiment 2. Given that all participants spontaneously generalized the color rule to the shape stimuli, participants completed Phases 2.1 (t47 = 196.4; p < 0.0001) and 2.2 with almost no errors (t47 = 427.55; p < 0.0001; see Fig. 1B; Table 3). When we changed the rule that applied to the colored stimuli (Phase 2.3), participants very quickly learned the new rule and then returned to ceiling performance (t47 = 157.68; p < 0.0001). Finally, when presented with the final generalization test (Phase 2.4), every participant, without exception, spontaneously applied the new color rule to the shape probes, despite having previously learned that a different rule applied to these stimuli (standard color trials: t47 = 263.6; p < 0.0001; probe shape trials: t47 = 42.53; p < 0.0001), again showing a very different pattern of responses from that of the monkeys (compare Phase 2.4 in Figs. 3 and 4).

These results highlight that while both monkeys and humans were able to acquire abstract rules and apply them to new stimuli on a trial-by-trial basis, the pattern of learning across species was markedly different.

Discussion

The goal of this study was to directly compare how monkeys and humans learn and generalize abstract rules, using comparable methods across species. Our results, as we will discuss below, provide evidence for two main conclusions. The first is that macaques may be capable of more complex, higher-level rule learning than has previously been reported in non-human primates. Secondly, our results also highlight qualitative differences in the ways in which humans and monkeys learn and represent abstract rules, hinting at meaningful cognitive differences between the species.

Returning to our first conclusion, our experiments, which required generalization to novel stimuli on a trial-by-trial basis, provide the strongest evidence to date of abstract rule learning in monkeys. These rules go beyond the types of regularities present in statistical learning tasks, which are tied to specific perceptual properties of the individual stimuli, and instead require higher order representations of the relationships between stimuli, independent of specific stimulus properties. The level of generalization reported here also extends beyond what has been observed in nonhuman primates previously. For example, although prior work26 showed that monkeys were sensitive to rules governing sequences of vocalizations (macaque coos and grunts), they only showed generalization to novel calls of the same type. Therefore, the monkeys may have encoded the stimuli as ‘coo-grunt-coo’ (or given the specific testing sequences used, even ‘ends with coo’), rather than a higher order representation of the relationships between the stimuli, independent of the specific stimuli themselves. By contrast, our results demonstrate abstract rule learning and generalization to novel colored stimuli on a trial-by-trial basis, which cannot be based on any perceptual features of the stimuli. The learning and generalization of abstract rules is critical to human language and cognition, therefore evidence of similar abilities in nonhuman primates represents an important and meaningful advance in understanding primate cognitive evolution.

Regarding our second conclusion, our experiments revealed important features of rule learning in monkeys that have not previously been evident using more traditional experimental approaches. The use of almost identical methods and stimuli in monkeys and humans allows for direct cross-species comparisons and minimizes the likelihood that the differences observed could be attributed to different methodologies.

Firstly, in line with prior studies using operant approaches in nonhuman primates11, a clear point of difference across the species is evident in the learning rates of humans and monkeys. Learning, particularly of rules governing shape stimuli, was extremely slow and laborious for monkeys. While all human participants learned and generalized these rules almost immediately (under 10 trials in all participants), the monkeys required many thousands of trials to reach similar levels of performance. Moreover, all six of the monkeys tested showed a highly consistent pattern of learning, featuring a protracted period of several thousand trials where no learning occurred at all, before beginning to slowly learn which rule applied (see Fig. 2). This may suggest that, unlike the humans, the monkeys initially attended to perceptual features of the stimuli, and only over time learned to attend to the abstract rule. These results suggest that relative to monkeys, humans have a much stronger predisposition to notice and identify higher order rules encoding the relationships between stimuli.

Secondly, similar to other studies assessing rule-guided behaviors in a range of species, these experiments demonstrated that when rules were changed (Phase 2.3), monkeys were very slow to adopt the new rules. Specifically, these results suggest that rather than switching between a set of possible rules, the monkeys had to unlearn the previous rule-based response strategy (as evidenced by perseverative errors), and then relearn a new rule from scratch (for example, in Fig. 3 note the period of approximately 5,000 trials in which Zelda remained solidly at chance performance after unlearning one rule but before identifying the correct one). In contrast, all human participants identified the new ‘correct’ rule within a few trials and adopted it consistently thereafter. These results may suggest that humans automatically encoded all three of the sequences of stimuli as following different rules, but understood that only one of those rules was ‘correct’ at any given time. When the previously ‘correct’ rule stopped being rewarded, they then rapidly explored which of the other rules might now be ‘correct’. By contrast, the monkeys seem to have identified which rule was correct, but do not appear to have had a representation of the rules governing the other sequences that they could then switch to. While it is likely that, given more experience with rule changes, monkeys would have learned to switch more quickly, this result suggests a potential species difference in how these rules are represented.

Finally, we showed that while monkeys can form abstract rules within one category of stimuli (e.g., color), they do not show spontaneous generalization to other stimulus domains, from colors to shapes. Experiment 2 further demonstrated that even with additional training, monkeys failed to generalize across stimulus categories. Importantly, these results stand in stark contrast to our data in humans (Experiment 3), all of whom immediately and spontaneously generalized these abstract rules across stimulus classes, both initially (Phase 1.2) and following a change in the rule (Phase 2.4). Note that there was no specific expectation that human participants would show this generalization; it would have been entirely reasonable to assume that the two different categories of stimuli followed different rules. Therefore, it is particularly noteworthy that every human participant instead generalized across domains, unlike the monkeys. Together, these results suggest that monkeys appear to form multiple independent representations of rules that are specific to a stimulus domain, and which do not appear to interact with one another. By contrast, humans spontaneously formed a single higher level rule, allowing generalization across stimulus domains. These results imply a fundamental difference in how these rules are represented across species.

These three findings, about the rate of learning, difficulty in rule switching, and lack of generalization across stimulus domains, highlight a stark contrast between monkeys and humans. This leads to the second key conclusion stated above, that humans and monkeys seem to use fundamentally different rule learning strategies, instantiated by different cognitive rule representations across species.

This conclusion is in line with findings from other areas of cognition. For example, relative to adult humans, monkeys show more domain specificity in physical reasoning (including a dissociation between spatiotemporal and contact mechanics)30. Great apes show domain-specific use of object properties (color, shape) for object identification and individuation31. Finally, while humans spontaneously form symmetrical, reversible symbolic relationships between stimuli (if A is associated with B, then B is also associated with A), both behavioral and neural evidence demonstrate that monkeys form a more restricted, specific, directional association, and do not show this symmetry or reversibility effect32,33,34. Our data, alongside these and other studies in a wide range of areas of cognition, suggest that a core difference might be in the specificity of relationships and the depth of abstraction that humans are capable of relative to other animals.

These potential differences have important implications for comparative research into the evolution of language. For the last 20 years (since Hauser et al., 2002; ref. 35), there has been a substantial amount of research investigating the types of grammatical rules that nonhuman animals are capable of learning (for reviews, see36,37). While this has been a fruitful and worthwhile endeavor, our present results suggest that an alternative approach might also be beneficial. Rather than focusing only on what nonhuman animals are able or unable to learn, we argue that the field should also consider how that learning proceeds, and the nature of the cognitive representations that are formed. This argument is in line with recent work proposing that humans, uniquely, possess internal symbolic languages of thought which rapidly and spontaneously identify, encode and compress structures and patterns in environmental stimuli14. By contrast, nonhuman animals, including primates, appear to process information fundamentally differently, using more classical associative processes38. While the current study does not require recursive, symbolic mental languages, even this relatively simple task was able to reveal stark differences between human and nonhuman primates, in terms of learning rates, cognitive flexibility, and patterns of generalization. Alongside evidence for abstract rule learning and generalization in monkeys, these data highlight fundamental cognitive differences in how humans and macaques learn and represent abstract rules.

Limitations of the study

Our study compared rule learning in rhesus macaques and adult human participants, who had decades of experience with abstract rules in language and other domains. Differences in language experience, attention, perception or motivation between the human participants and the monkeys may therefore contribute to the species differences observed here. In particular, language experience may account for some of these differences, so future testing with children would be beneficial. Secondly, it is possible that with more experience of rules changing, monkeys may learn to switch strategies more quickly. Finally, given the small number of monkeys available, all monkeys were trained with colored stimuli before shape stimuli, so it was not possible to test generalization in the reverse direction, from shapes to colors.

Methods

Participants

Monkeys

In Experiment 1, six rhesus macaques (Macaca mulatta, all female, ages 4–5 years) participated in the experiment. Due to the large number of trials required, only two of these macaques participated in Experiment 2. These sample sizes are comparable to other studies in the fields of comparative cognition and neuroscience (e.g.,11,39). No animals or data were excluded from any analyses. Macaques received food rewards during their participation in our task. To avoid a calorie surplus, feedings were scheduled for the end of the day (after 4:00 pm) to ensure only the remaining necessary calories were provided. Macaques were fed nutritionally complete primate biscuits (LabDiet), as well as produce and enrichment. The monkeys were born and reared in their family groups at Emory National Primate Research Center. When they reached maturity, they were transferred to the laboratory and were pair-housed. The monkeys were temporarily separated by an opaque plastic divider while participating in the experiments. All monkeys were housed in the same types of cages in the same room, and were tested at the same time, to minimize any potential confounds of testing time or location. Ethical approval was granted by the Emory University IACUC (PROTO202100030), and all procedures were in line with ARRIVE guidelines40. All methods were conducted in accordance with relevant guidelines and regulations.

Humans

In Experiment 3 we recruited 48 adult human participants (mean age, 18.8 years; 34 female, 13 male, 1 nonbinary) through the Emory Psychology SONA system for research credit. Ethical approval was granted by the Emory University Institutional Review Board (IRB STUDY00003577). All research was performed in accordance with all relevant guidelines and regulations, including the Declaration of Helsinki, and informed consent was obtained in all cases. All data were anonymized before publication.

Stimuli

On each trial of the experiment, participants were presented with three sequences, each containing three stimulus elements, generated by three different rules: ‘AAB’, ‘ABA’, and ‘BAA’ (see Fig. 1). For each participant, one of these rules was assigned as ‘correct’ (pseudo-randomized and counterbalanced across participants), and selecting the sequence conforming to that rule was rewarded (see below). Each sequence contained the same set of two ‘A’ elements and one ‘B’ element, so it was impossible to select the correct sequence based on the presence of these individual elements. Moreover, the specific ‘A’ and ‘B’ elements were randomly generated for each trial (see below), and each stimulus was randomly assigned as ‘A’ or ‘B’ on any given trial. Therefore, participants could not solve this task by attending to individual stimulus features, and instead had to rely on the higher-level, configurational rules governing the sequences.
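As an illustration of this trial structure, the following Python sketch builds the three sequences for a single trial from one ‘A’ element and one ‘B’ element. It is not the authors' MATLAB code. The function name build_trial is hypothetical, and the randomized on-screen order of the three sequences is an assumption that is not stated in the text.

```python
import random

RULES = ("AAB", "ABA", "BAA")  # the three rules named in the text


def build_trial(element_a, element_b, correct_rule, rng):
    """Return the three sequences for one trial and the index of the correct one."""
    if correct_rule not in RULES:
        raise ValueError(f"unknown rule: {correct_rule}")
    lookup = {"A": element_a, "B": element_b}
    rules = list(RULES)
    # Assumption: the on-screen order of the three sequences is randomized.
    rng.shuffle(rules)
    # Every sequence uses the same two 'A' elements and one 'B' element,
    # so only the arrangement distinguishes the correct sequence.
    sequences = [tuple(lookup[symbol] for symbol in rule) for rule in rules]
    return sequences, rules.index(correct_rule)


if __name__ == "__main__":
    sequences, correct_index = build_trial("red", "blue", "ABA", random.Random(1))
    print(sequences, correct_index)
```

Because all three sequences contain identical elements, a participant in this sketch can only identify the rewarded sequence from the positions of the elements, which mirrors the logic described above.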

Two types of stimulus elements were used in the experiment. The first were colored squares (see Fig. 1A, left). The colors were generated by randomly selecting RGB values, with the only restriction being that if the colors of the two stimulus elements on any given trial were too similar (if the Euclidean distance in RGB color space between the two colors was less than 0.8), new colors were randomly generated, to ensure that the stimulus elements were easily discriminable. The second type of stimulus elements were abstract white shapes on black backgrounds (see Fig. 1A, right). The shape stimuli were also randomly generated for each trial using custom MATLAB code, again ensuring that novel stimuli were used on every trial. We randomly generated polygons with between 3 and 12 sides. To ensure that the shapes were easily discriminable, we specified that the number of sides must differ by at least 4 between the two shapes (e.g., a 4-sided shape and an 8-sided shape). We also made one shape appear more rounded by smoothing its edges, again to increase discriminability.
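The two discriminability constraints can be illustrated with a short rejection-sampling sketch in Python. This is not the authors' MATLAB code (which is available at the OSF link given in this section). The function names are hypothetical, and the sketch assumes that RGB channels are scaled to the range 0 to 1 and that both colors (or both side counts) are redrawn whenever a pair fails the check.

```python
import math
import random

MIN_RGB_DISTANCE = 0.8  # threshold from the text; channels assumed to lie in [0, 1]


def random_color(rng):
    """Draw an RGB triple with each channel uniform on [0, 1]."""
    return (rng.random(), rng.random(), rng.random())


def discriminable_color_pair(rng, min_distance=MIN_RGB_DISTANCE):
    """Redraw both colors until their Euclidean RGB distance is at least min_distance."""
    while True:
        color_a, color_b = random_color(rng), random_color(rng)
        if math.dist(color_a, color_b) >= min_distance:
            return color_a, color_b


def discriminable_side_counts(rng, min_sides=3, max_sides=12, min_difference=4):
    """Redraw two polygon side counts until they differ by at least min_difference."""
    while True:
        sides_a = rng.randint(min_sides, max_sides)
        sides_b = rng.randint(min_sides, max_sides)
        if abs(sides_a - sides_b) >= min_difference:
            return sides_a, sides_b


if __name__ == "__main__":
    rng = random.Random(0)
    print(discriminable_color_pair(rng))
    print(discriminable_side_counts(rng))
```

The sketch covers only the selection of colors and side counts. Drawing the polygons and smoothing the edges of one shape are not shown.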

To ensure that these shape stimuli were easily discriminable to the monkeys, in a separate follow-up experiment we presented three monkeys with a single example of the ‘A’ and ‘B’ stimuli on each trial. Selecting one of these stimuli was rewarded (S+) and selecting the other was not (S-). When presented with new stimuli, the monkeys very rapidly learned which stimulus was rewarded, and selected this stimulus on almost 100% of trials, demonstrating that they could easily discriminate the shape stimuli. In a final follow-up experiment, we repeated this procedure using sequences of the same shapes (i.e., ‘AAA’ and ‘BBB’), and again, monkeys readily discriminated between the shape sequences (Table S3).

The MATLAB code that was used to generate the color and shape stimuli is available at https://osf.io/vawh6/.

Apparatus

Monkeys were tested in their home cages, using custom-designed touchscreen testing systems mounted to the front of the cage. They were able to interact freely with the screens for approximately 6 h per day, 5 days per week. When monkeys selected the correct stimulus sequence, they received a food reward (nutritionally complete primate pellet, 97 mg, TestDiet). The experiments were coded and run in MATLAB 2022b (Mathworks, USA) through the Psychophysics Toolbox41 on laptop computers (Dell) using touchscreen monitors (ELO).

Human participants were tested in custom-designed testing laboratories within the Department of Psychology at Emory University. Like the monkeys, the participants made responses using touchscreen monitors (Dell). The human participants received no explicit instruction about the task and were only told that they should try to select the ‘correct’ stimuli by touching the screen.

Procedure

The participants progressed through several experimental phases (see Results, and Fig. 1B). On all trials, participants were presented on a touchscreen computer with three sequences of stimulus elements, each following a different rule (‘AAB’, ‘ABA’, and ‘BAA’). Participants selected one of the three sequences and received either positive feedback (a green screen, a positive ‘ding’ sound and, for macaques only, a food pellet) or negative feedback (a red screen, a negative ‘beep’ sound, a brief timeout and no food reward). For the macaques, in all phases, our criterion to advance was performance of at least 70% (chance performance is 33%) on at least three out of five consecutive testing days. For humans, our criterion was at least 80% performance over 40 trials.
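The two advancement criteria can be expressed as simple checks, sketched here in Python rather than the MATLAB used for the experiments. The function names are hypothetical. The sketch assumes that the macaque criterion is evaluated over full windows of five consecutive testing days, and that the human criterion is evaluated over the 40 most recent trials. Neither detail is specified in the text.

```python
def macaque_criterion_met(daily_accuracy, threshold=0.70, days_required=3, window=5):
    """True if any full window of consecutive days has enough days at or above threshold."""
    for start in range(len(daily_accuracy) - window + 1):
        days = daily_accuracy[start:start + window]
        if sum(accuracy >= threshold for accuracy in days) >= days_required:
            return True
    return False


def human_criterion_met(outcomes, threshold=0.80, window=40):
    """True if the proportion correct over the most recent `window` trials meets threshold."""
    if len(outcomes) < window:
        return False
    recent = outcomes[-window:]
    return sum(recent) / window >= threshold


if __name__ == "__main__":
    # Three of these five days are at or above 70%, so the criterion is met.
    print(macaque_criterion_met([0.50, 0.72, 0.60, 0.75, 0.80]))
    # 32 correct out of the last 40 trials is exactly 80%.
    print(human_criterion_met([True] * 32 + [False] * 8))
```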

Data analysis

For both humans and monkeys, a within-subjects design was used. Our outcome measure was the proportion of trials on which participants selected the ‘correct’ stimulus sequence, and this was compared to chance performance (33.3%). Given our relatively small sample size, we present individual analyses for each monkey (chi-squared tests), as well as group-level statistics (one-sample t-tests). By reaching our learning criterion (Phases 1.1, 2.1, 2.2, 2.3), all animals necessarily performed very significantly above chance levels (p < 0.0001, chi-squared tests on the final learning block of 200 trials), but these statistics are presented for completeness. In our generalization tests (Phases 1.2 and 2.4) we analyzed performance on all of the probe trials (500) and the final 500 standard trials using chi-squared tests (individual analyses) and one-sample t-tests (group-level analyses). Human data from Experiment 3 were analyzed using one-sample t-tests for each phase, and performance was compared to chance levels (33.3%). All analyses were automated and were performed in MATLAB 2022b.
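To make the comparison with chance concrete, the following Python sketch shows one way such tests could be computed. It is not the authors' MATLAB analysis code. It assumes that the individual analysis is a chi-squared goodness-of-fit test on correct versus incorrect counts with expected proportions of 1/3 and 2/3, and that the group analysis is a one-sample t-test on per-subject proportions correct. The function names are hypothetical, and only the t statistic and degrees of freedom are computed for the group test.

```python
import math
import statistics

CHANCE = 1 / 3  # three sequences are presented on every trial


def chi_squared_vs_chance(n_correct, n_trials, chance=CHANCE):
    """Goodness-of-fit test of correct/incorrect counts against chance (1 degree of freedom)."""
    n_incorrect = n_trials - n_correct
    expected_correct = n_trials * chance
    expected_incorrect = n_trials * (1 - chance)
    statistic = ((n_correct - expected_correct) ** 2 / expected_correct
                 + (n_incorrect - expected_incorrect) ** 2 / expected_incorrect)
    # With 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def one_sample_t_statistic(proportions, chance=CHANCE):
    """t statistic and degrees of freedom for per-subject proportions versus chance."""
    n = len(proportions)
    standard_error = statistics.stdev(proportions) / math.sqrt(n)
    t_statistic = (statistics.mean(proportions) - chance) / standard_error
    return t_statistic, n - 1


if __name__ == "__main__":
    # Hypothetical example: 140 correct in a block of 200 trials (70%).
    print(chi_squared_vs_chance(140, 200))
    # Hypothetical per-subject proportions correct.
    print(one_sample_t_statistic([0.5, 0.6, 0.7]))
```

In this sketch, a block of 200 trials at exactly the 70% criterion gives a p value far below 0.0001, consistent with the statement that animals reaching criterion necessarily performed above chance.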