Abstract
Research into statistical learning, the ability to learn structured patterns in the environment, faces a theory crisis. Specifically, three challenges must be addressed: a lack of robust phenomena to constrain theories, issues with construct validity, and challenges with establishing causality. Here, we describe and discuss each issue in relation to several prominent statistical learning phenomena. We then offer recommendations to help address the theory crisis and move the field forward.
Introduction
The capacity to learn from experience is critical for survival and therefore ubiquitous across most organisms. An important type of learning, called statistical learning, involves sensitivity to structured patterns in the environment1,2,3,4. Patterns come in many varieties, ranging from how certain objects typically appear (e.g., animals, trees, and human-made objects each have a particular form and shape) to the regularities found in temporal input streams such as music and language (e.g., hearing random notes on a piano would not be considered music, just as a random concatenation of speech syllables would sound meaningless).
Decades of research on statistical learning, and its close cousin, implicit learning5,6, have demonstrated that humans (and other organisms) are sensitive to spatial and temporal patterns across many perceptual, motor, and cognitive domains7,8. This type of learning appears to occur relatively automatically, without intention to learn, and results in behavioral, cognitive, and/or perceptual facilitation that is often (though not always) accompanied by a lack of conscious awareness of the learned patterns. Because of the apparent reach of statistical learning to explain functioning across a myriad of domains and situations, it is perhaps not surprising that this area of research has been exploding in recent years9,10.
However, recent critiques have highlighted challenges facing statistical learning research. Namely, the construct of statistical learning itself is underspecified in terms of its underlying neurocognitive mechanisms, how these mechanisms relate to the various laboratory tasks commonly used to measure statistical learning, and how statistical learning relates to other forms of learning, memory, and cognition9,11,12,13,14. While some theoretical frameworks do exist, including the extraction-integration framework15,16,17,18, chunking approaches9,19,20, frameworks favoring the role of brain plasticity and modality constraints2,7,21, and other multi-component models22,23,24,25, there is no universally agreed-upon theory that can adequately address these issues.
We contend that these challenges are not unique to statistical learning research but in fact are difficulties confronting the psychological sciences more broadly. It has been argued that the field of psychology faces a “theory crisis”; that is, psychological theories are often poor at explaining the relevant phenomena of interest and no amount of improved statistical techniques or replication studies will improve the situation26,27,28. Whereas efforts have been made to encourage psychologists to create more formal or precise theories29, Eronen and Bringmann30 have recently argued that “the core of the problem is that developing good psychological theories is extremely difficult” (p. 780). Therefore, understanding the obstacles facing theory building in psychology is the first necessary step for making progress in addressing the theory crisis.
In this paper, we use Eronen and Bringmann’s30 and others’ insights about the theory crisis as a foundation for moving forward the field of statistical learning, a field that has seen an explosion of growth in recent years9. Statistical learning, and even implicit learning, have historically been challenging to define, as different tasks may elicit different cognitive processes and engage distinct neuroanatomical systems (for a review, see ref. 14). Therefore, for the purposes of this paper, we define statistical learning prototypically, as encompassing situations that tend to have a specific set of characteristics, in particular, those that involve the learning of patterns in the environment, through multiple exposures, and under incidental conditions7. Note that this definition can be thought of as a higher-level description of what problem or goal the system has evolved to address—i.e., to learn environmental patterns over multiple exposures under incidental conditions—rather than a definition predicated on identifying the learning processes or neural mechanisms involved. This higher-level description is similar to Marr’s31 top level of analysis (what he calls the “computational” level), whereas descriptions of the cognitive and neural mechanisms fall under Marr’s second (“algorithmic”) and third (“implementational”) levels, respectively (see Conway et al.32, for elaboration of Marr’s framework in relation to statistical learning). The advantage of starting with such a higher-level definition is that it is based on the characteristics of the learning situation, rather than on the characteristics of the learner, which are “unseen” cognitive and neural processes that can be difficult to identify14. Thus, this definition provides (what we believe is) a relatively uncontroversial description of the types of situations that statistical learning typically encompasses and clearly specifies to what extent any given task or situation is relevant to this area of research.
With this definition in place, the ultimate aim should be to derive a theoretical framework that provides meaningful and valid conceptualizations of the various phenomena that statistical learning touches upon. To work towards this goal, in the following sections we first describe three key issues as laid out by Eronen and Bringmann: lack of robust phenomena, issues with construct validity, and issues with establishing causality. Next, we discuss each issue in terms of how it relates to statistical learning by focusing on several research phenomena that have been examined in some detail. In the final section of this paper, we offer suggestions for ways to address the theory crisis in statistical learning research to move the field forward in fruitful directions.
Challenges facing SL research: lack of robust phenomena
Eronen and Bringmann30 contended that there is a lack of robust phenomena that can be used to constrain and formulate psychological theories. To understand this problem fully, it is necessary to describe the difference between data, phenomena, and theory29,30 (see Fig. 1). Data are the raw measurements collected in psychology experiments (i.e., measures of behavior or neurophysiology). Phenomena are a description of the effects compiled across the observed data; an example in psychology might be the Stroop effect33,34 or implicit bias/stereotype effect35, which are descriptions of consistently occurring patterns within data. Finally, theories are created to explain the phenomena; and yet, phenomena also constrain theories30. It is this last point that Eronen and Bringmann30 claimed has received little attention in psychology. Without the existence of robust—i.e., consistent and stable—phenomena, theories are on shaky ground.
Fig. 1 | Eronen and Bringmann30 emphasize the distinction between data, phenomena, and theories. Given the lack of universally accepted theories within psychology, we turn instead to the theory of evolution to provide concrete examples of the interplay between data, phenomena, and theory. Charles Darwin did not set out with the goal of developing the theory of evolution. Rather, he initially made a wide range of different, at the time disparate, observations, including, for example, that the large ground finch (which happened to have a large, powerful beak) primarily ate seeds, while the green warbler finch (with a narrower, pointed beak) fed upon insects. These observations represent data (individual images in the figure). Based on many pieces of data, more general phenomena emerged (e.g., that animals are well adapted to their environments; that offspring inherit the traits of their parents; and that differential survival rates lead to changes in how a given phenotype is represented in a population). Finally, Darwin advanced the theory of evolution by natural selection151, which explained and unified these phenomena. Critically, a good theory, like evolution, must be able to explain all the relevant phenomena, and remains open to falsification if counterexamples are found. In this way, not only do theories explain phenomena, but phenomena tightly constrain the space of possible theories. Note that the theory of evolution was selected as an example because it is extremely well developed and supported by data. The development of any new theory, including a theory of statistical learning, is likely to be less clear and much noisier. For example, it is yet to be determined whether all of the phenomena outlined in this article (or other phenomena not discussed here) are sufficiently robust to be included in an eventual theory. Moreover, it is possible that a theory may arise that explains some, but not all, of these phenomena. These difficulties represent some of the key challenges of theory building, but we hope that the proposals made in this article will help advance such theories in the future. The images of animals within the figure were created using BioRender.com. Wilson, B. (2025) https://BioRender.com/5iaoy0f.
Is there a lack of robust phenomena in statistical learning research? What are the relevant phenomena in the first place, and what is the metric for evaluating whether they are robust? Robustness of a phenomenon is at least partly related to the notion of replicability; if a particular pattern of data is not observed consistently across studies, then it is not a robust finding. However, as Eronen and Bringmann point out, for a phenomenon to be robust it is just as important that it be “verifiable and detectable in several independent ways and not dependent on a specific theoretical framework or observation method”30,36,37 (p. 780).
Based on this conceptualization of robustness, we suggest that the overall phenomenon of statistical learning itself, that is, the general finding that people are sensitive to patterns in the environment under incidental conditions, is robust, demonstrated across many studies in a multitude of situations using various laboratory tasks and stimuli38. While this general phenomenon appears robust, it is also important to delineate a number of additional sub-phenomena that have been examined in the statistical learning literature, including but not limited to: modality effects39, effects of input complexity40, age effects41, species differences42, role of attention in learning43, competition between executive functions and statistical learning44, the contribution of statistical learning to language acquisition13, and atypical learning in developmental disorders45 (see also7, for a review of some of these phenomena). In subsequent sections we provide an evaluation of the robustness of several of these key phenomena.
The point is that a good theoretical framework of statistical learning should not only encompass the general phenomenon of statistical learning itself, but also these additional sub-phenomena (e.g., the presence of modality effects)—but only those that are considered robust. If a phenomenon is not robust, it may be counterproductive to devise theoretical frameworks to explain it; doing so could lead to the unfortunate situation where “correct” theories are rejected because they fail to account for these (unrobust) phenomena. We return to this point at the end of the paper when discussing recommendations. It is therefore imperative to identify which phenomena appear to be robust and which ones need additional empirical support or testing. We discuss the robustness of several such phenomena in Section “Evaluating Statistical Learning Phenomena”, below.
Challenges facing SL research: issues with construct validity
The second challenge facing the psychological sciences, highlighted by Eronen and Bringmann30 concerns issues with construct validity. A simple definition of construct validity is that a measurement measures what it is intended to measure46. Whereas reliability can be easily quantified and tested, construct validity is more nebulous. As such, psychological scientists typically report little evidence for validity, instead focusing on more concrete psychometric properties such as reliability47,48. For instance, statistical learning researchers have made strides to improve the reliability of the measurements used49,50,51,52. This is an important and necessary step. However, while issues of validity have been mentioned in the statistical learning literature, including the use of vague or circular definitions for statistical learning and/or the use of different terms and a multitude of ways to measure it9,11,14,53, no formal attempts at establishing construct validity have been made.
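To make this asymmetry concrete, here is a minimal sketch (in Python, with simulated data and hypothetical variable names): reliability can be estimated directly from the test scores themselves, for example via a split-half correlation with a Spearman-Brown correction, whereas no analogous formula yields construct validity.

```python
# Minimal sketch: reliability of a statistical learning measure is easy to
# quantify from the data alone; construct validity is not. Data are simulated
# and variable names are hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)
ability = rng.normal(0.65, 0.12, size=100).clip(0.5, 0.95)       # per-person "true" accuracy
scores = (rng.random((100, 32)) < ability[:, None]).astype(int)  # 100 participants x 32 2AFC items

odd = scores[:, 0::2].mean(axis=1)    # mean accuracy on odd-numbered test items
even = scores[:, 1::2].mean(axis=1)   # mean accuracy on even-numbered test items

r_half = np.corrcoef(odd, even)[0, 1]   # split-half correlation
r_sb = 2 * r_half / (1 + r_half)        # Spearman-Brown corrected reliability

print(f"split-half r = {r_half:.2f}; Spearman-Brown reliability = {r_sb:.2f}")
```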
A further challenge to assessing the validity of measures in statistical learning research is that different types of methods have been used to assess statistical learning (see refs. 11,53), including the following tasks: artificial grammar learning (AGL)5, serial reaction time (SRT)54, Hebb repetition55, triplet segmentation or embedded pattern tasks4, cross-situational learning56, and contextual cueing57. Not only this, but even the same paradigm (e.g., AGL) can vary in terms of the type of input used or how learning is measured. The challenge with using such an array of task methodologies is that, while all of them on the face of it appear to measure statistical learning, each one may emphasize a different type or aspect of such learning, and potentially be supported by different underlying neurocognitive mechanisms.
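To illustrate the kind of statistical structure such paradigms embed, the sketch below generates a continuous syllable stream loosely in the spirit of the triplet segmentation task4, in which transitional probabilities are high within “words” and lower across word boundaries. The syllable inventory and word list are hypothetical, and real experiments differ in many details (counterbalancing, number of tokens, test items, etc.).

```python
# Minimal sketch of a triplet segmentation stream: syllable "words" are
# concatenated in random order (no immediate repeats), so transitional
# probabilities are ~1.0 within words and ~0.33 across word boundaries.
# The words themselves are hypothetical examples.
import random
from collections import Counter

words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku"), ("pa", "do", "ti")]

random.seed(1)
stream, prev = [], None
for _ in range(300):                                  # 300 word tokens
    word = random.choice([w for w in words if w != prev])
    stream.extend(word)
    prev = word

# Forward transitional probability: P(next syllable | current syllable)
pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])
tp = {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

print("within-word TP, tu->pi:", round(tp[("tu", "pi")], 2))               # ~1.0
print("across-boundary TP, ro->go:", round(tp.get(("ro", "go"), 0.0), 2))  # ~0.33
```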
As pointed out by Eronen and Bringmann30, weak construct validity impacts robustness of phenomena. Because phenomena depend upon inferences made over data, if those data are not well-understood or well-validated, then the phenomena themselves are unlikely to be robust.
Challenges facing SL research: issues with establishing causality
The final challenge raised by Eronen and Bringmann30 is that determining causality among psychological variables is extremely challenging. As they note, good theories should “capture causal relationships between psychological variables” (p. 783). However, doing so is extremely difficult58. While it is well understood how to establish causality by manipulating external, non-psychological variables using sound experimental design (e.g., ref. 59), psychological theories must capture the causal relationships between psychological variables, which are much more difficult to directly observe and manipulate. Eronen and Bringmann30 refer to this as the “fat-handed” problem: to manipulate a psychological variable requires manipulating an external variable, but the manipulated external variable might be affecting other psychological processes/variables instead of or in addition to the specific psychological variable of interest.
To take an example from statistical learning research, one ongoing question is to what extent statistical learning depends upon attention (e.g.,7,60). One common way to manipulate participants’ attention is to use task instructions to direct selective attention to some stimuli but not others and to test if statistical learning still occurs for the unattended stimuli (e.g.,43). While such an approach is commendable and logical, the question is whether other psychological variables apart from attention may also be affected. For instance, it is possible that variables such as motivation, confidence, and/or effort may also be affected by the task instructions, and if so, it is difficult to know to what extent attention itself (or lack thereof) causally affected learning. Note that the problem of causality is exacerbated by the fact that different statistical learning tasks may depend on different underlying cognitive or neural processes. In sum, because psychological variables can only be measured indirectly, it is hard to verify which variables have specifically been changed and manipulated in any given paradigm.
Evaluating statistical learning phenomena
As we have seen, the challenges facing statistical learning research make it difficult to formulate accurate and reliable theories. To further illustrate how these issues pertain to statistical learning research, we focus on four phenomena that have been examined to varying degrees in the research literature: modality effects; age effects; species differences; and atypical statistical learning in developmental disorders, such as dyslexia. For each we consider (1) the robustness of the phenomena, (2) potential issues with construct validity, and (3) challenges with establishing causality. With this approach, we are advocating a “bottom-up” stance for theory-building. That is, before creating theories to explain particular phenomena, it is necessary to have a solid understanding of the robustness of the phenomena themselves, which includes assessing their validity and potential causal relationships between psychological variables.
Modality effects
Statistical learning has been observed across a range of scenarios, from auditory61 and visual scene analysis10 to language processing62 and motor learning63,64. However, comparisons across modalities have indicated that the conditions that optimize statistical learning vary depending on both modality (e.g., visual vs. auditory) and domain (e.g., linguistic vs. nonlinguistic)39,65,66,67,68,69 (for a summary, see ref. 70). Such modality-specific effects and representations may exist alongside modality-independent representations and domain-general mechanisms2,71, although this remains unclear72.
Robustness
A key theory2 of statistical learning posits that statistical learning exists across modalities but has modality-specific constraints. This is supported by consistent evidence for differences between modalities; for example, performance in the auditory modality is typically superior to the visual modality for the serial presentation of materials39,67, especially at faster rates65,66. On the other hand, the auditory advantage seems to disappear when the visual stimuli are presented simultaneously65,67. There are also asymmetries across modalities for the same domain; for example, visual tasks show a non-linguistic advantage while auditory tasks show a linguistic advantage73. Furthermore, modalities appear to operate with relative independence, showing limited transfer74 (though also see ref. 75 for a possible exception and Milne, Wilson & Christiansen70 for further discussion). Additionally, there is a lack of correlation across modalities52,76,77 as well as different developmental trajectories for statistical learning in different modalities78. The presence of modality differences across a broad range of tasks, stimuli, and research groups speaks to the robustness of this phenomenon. However, the same theory2 also posits a unitary, domain-general statistical learning mechanism (e.g., one supported by the hippocampus), yet the existence and nature of such a mechanism is still unclear (see Bogaerts et al.72 for a discussion of whether a “good statistical learner” exists).
Construct validity
Conclusions about modality effects may be greatly influenced by issues of construct validity. For example, what is the best way to measure statistical learning in each modality to provide a “fair” comparison between them? That is, how can we make methods comparable across modalities, while recognizing that there are some fundamental differences across sensory systems? At a minimum, the discriminability of the individual items should be comparable, as Conway & Christiansen39 ensured in a pre-training phase before exposure to the statistical input stream. Even when such methodological controls are used to equate stimuli across modalities, it is difficult to isolate statistical learning itself from other cognitive processes. For example, if modality differences are observed, are they due to statistical learning differences, or to the role of working memory79,80,81 or attention mechanisms across modalities (e.g., object-based attention82)? Alternatively, many participants impose a linguistic code on visual and auditory stimuli during statistical learning tasks68, which makes it difficult to isolate pure auditory and visual learning. Relatedly, if participants are using explicit strategies during training and testing, these could either mask or exacerbate differences between modalities; this may particularly be the case for commonly used two-alternative forced-choice tasks6,72. Finally, many publications comparing across modalities study individual differences; therefore, ensuring the test-retest reliability of the methods, and that they capture sufficient variance, is especially critical51.
Causality
If modality effects exist, we also need to establish what is driving them to inform any theoretical framework. Establishing causality relates closely to the issue of construct validity. While construct validity asks us to ensure we are measuring what we think we are measuring, delineating the related perceptual factors, cognitive functions, and potential strategies will also tell us how modality differences emerge and how they relate to statistical learning. In this process, we should also consider the fundamental reason why statistical learning exists. In most cases, this is to influence the way we interact with the world, be that in terms of the immediate allocation of cognitive resources (e.g.,83) or the long-term acquisition of speech regularities15. Better appreciating why the brain may have adapted to use statistics in a given modality could help to understand the differences or similarities in the causal relationships between statistical learning and other psychological variables.
Age effects
Three ideas have been advanced regarding how statistical learning abilities may interact with age. First, it has been suggested that statistical learning is age-invariant, with some studies showing that learning was unaffected by age18,41,84,85. Second, like many other cognitive abilities, statistical learning might improve with age86,87,88,89,90,91 (however, such an effect may be due to changes in other cognitive abilities such as working memory, which are known to improve with age92; see “Causality” below). Third, given the importance of statistical learning in language acquisition, and the fact that infancy and early childhood are when the majority of language learning typically takes place, it has been argued that infants and children show better statistical learning than adults, with some suggestion that language learning abilities decline with age93,94,95.
Robustness
Longitudinal studies are especially important for understanding potential age effects in statistical learning; indeed, a recent longitudinal study provides support for better statistical learning in younger compared to older children between the ages of 7 and 1496. However, there is at least some evidence supporting all three possibilities for how statistical learning and age interact (age-invariance, childhood superiority, and adult superiority), effectively weakening the robustness of all of them. Complicating matters further, there is also evidence that any age-related changes in statistical learning may be influenced by modality effects: Raviv and Arnon41 found improvements in visual statistical learning between 5- and 12-year-olds, but no changes in statistical learning in the auditory domain. However, in a similar study comparing visual statistical learning with auditory statistical learning of non-linguistic stimuli, improvements in both visual and auditory statistical learning were found between 5- and 12-year-olds97. This suggests that any modality-specific differences may be further complicated by differences between stimuli, both of which may affect the conclusions that can be drawn from the literature regarding the developmental trajectory of statistical learning. In sum, the nature of developmental changes across age is not a robust phenomenon, further complicated by the apparent influence of modality- and stimulus-specific effects on the development of statistical learning. On the other hand, there is good evidence that humans of varying ages (infants through adults) are all capable of statistical learning, so the presence of statistical learning across ages appears to be a robust finding.
One critical methodological consideration that specifically impacts our ability to determine changes in statistical learning across development relates to the “more room to improve” effect98. This refers to the difficulties that occur when trying to compare different age groups who have different baseline reaction times. While high variability between age groups is often controlled for by using standardized scores, in learning experiments specifically, this variability is actually a critical aspect of the learning process. Specifically, individuals who show greater learning effects typically show greater variability relative to poorer learners. Therefore, using standardized scores may obscure key learning differences, rendering comparisons between age groups uninformative. More recently, efforts have been made to overcome such challenges, for example, by controlling for average speed when using reaction time-based measures of learning98.
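The following sketch illustrates the logic of the problem with made-up numbers: an identical proportional learning effect produces a larger raw reaction-time benefit in the slower (e.g., younger) group, and a baseline-normalized score is one simple way of controlling for average speed (the specific approach used in ref. 98 may differ).

```python
# Minimal sketch of the "more room to improve" problem: two groups show the
# same proportional speed-up on patterned trials, but raw difference scores
# favor the slower group. All reaction times (ms) are hypothetical.
def learning_scores(rt_random, rt_patterned):
    raw = rt_random - rt_patterned   # raw RT benefit (ms)
    ratio = raw / rt_random          # benefit as a proportion of baseline speed
    return raw, ratio

child_raw, child_ratio = learning_scores(rt_random=800.0, rt_patterned=720.0)
adult_raw, adult_ratio = learning_scores(rt_random=400.0, rt_patterned=360.0)

print(child_raw, adult_raw)      # 80.0 vs 40.0 -> raw scores suggest children "learned more"
print(child_ratio, adult_ratio)  # 0.1 vs 0.1   -> baseline-controlled scores are equivalent
```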
Construct validity
There are issues relating to the construct validity of statistical learning that are particularly important to consider when measuring people of different ages. Statistical learning in infancy was originally measured using preferential-looking paradigms, which offer an implicit method of measuring learning when participants are not able to provide explicit judgements. In adults and children, by contrast, learning is typically evidenced through above-chance performance on a grammaticality judgement or familiarity task, in which participants are asked to provide explicit judgements on whether a sequence fits or breaks a pattern to which they have previously been exposed. Because these tasks measure learning directly by asking participants to classify sequences (often based on ‘gut instinct’), performance may be affected by additional cognitive abilities, such as understanding the task instructions and decision-making skills, which are known to improve across development99. As an example, “yes” biases—in this case a preference for categorizing sequences as grammatical—are commonly reported in AGL tasks with children100,101, which may suggest that children have a lower threshold for classifying sequences as grammatical, that they pick up and use irrelevant features of the sequences to judge their grammaticality, or that they simply do not understand how to complete the task. In any case, such explicit measures of statistical learning are likely to underestimate children’s abilities, which makes comparing statistical learning across development using these tasks difficult.
More recently, alternative tasks have been developed that assess statistical learning indirectly, by measuring other variables that are facilitated by statistical learning. For example, SRT tasks102,103, serial recall tasks104,105,106, and tasks measuring neural entrainment107 can assess statistical learning without requiring explicit reflection on what has been learned. Indirect measures of learning also typically provide additional benefits over direct measures in that they can measure learning over the course of the task, while learning is taking place. Therefore, indirect measures of learning may provide a useful method of comparing statistical learning abilities across age, by avoiding the requirement for more explicit processes which may not accurately represent children’s performance. Additionally, these tasks provide information about the trajectory of learning across the task, which could reveal less obvious differences in learning patterns between children and adults.
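As a simple illustration of how an indirect measure can track learning while it unfolds, the sketch below simulates an SRT-style task and computes, block by block, the reaction-time advantage for predictable over unpredictable trials; the numbers are arbitrary and serve only to show the logic of an online learning trajectory.

```python
# Minimal sketch of an online, indirect learning measure: the per-block RT
# advantage for predictable vs. unpredictable trials traces the learning
# trajectory during the task itself. Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(2)
n_blocks, trials_per_type = 8, 30

trajectory = []
for block in range(n_blocks):
    benefit = 10 * block                                       # hypothetical learning effect (ms)
    rt_pred = rng.normal(500 - benefit, 40, trials_per_type)   # predictable trials speed up
    rt_unpred = rng.normal(500, 40, trials_per_type)           # unpredictable trials do not
    trajectory.append(rt_unpred.mean() - rt_pred.mean())       # per-block RT advantage

print([round(x, 1) for x in trajectory])  # increasing values indicate learning over the task
```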
Causality
As we have discussed, the developmental trajectory of statistical learning, from infancy to adulthood, is already somewhat unclear, both due to a lack of robust age effects and the issues surrounding the validity of the experimental measures used. These issues are compounded by (or possibly caused by, at least in part) challenges with understanding causal relationships between different psychological variables. Aside from statistical learning, myriad cognitive processes change and develop as we age, including attention, memory, executive functions, and language. It is very likely that these abilities contribute to performance on statistical learning tasks, and each of these, and their interactions, could impact the results of developmental studies. While establishing the causal relationships between different psychological processes is a challenge for all statistical learning research, this issue is compounded in developmental research, due to the co-development of so many other aspects of cognition. However, examining changes in other statistical learning phenomena across development using longitudinal studies may actually aid attempts to establish causality. By measuring these other related cognitive functions alongside statistical learning at various timepoints in development, we may be better placed to draw conclusions regarding which aspects of cognition – or other environmental factors such as education – influence the development of statistical learning.
Species differences
Over the last two decades, much comparative research has investigated the extent to which statistical learning abilities might be shared by species other than humans. This research has used a wide range of methods (see Construct Validity, below), but here we will primarily focus on those that assess learning under incidental conditions, particularly in nonhuman primates, as opposed to explicit, operant training (which is common in birds and rodents). These approaches have been used to demonstrate statistical learning abilities in a wide range of primates (tamarins108,109,110, marmosets21,111, squirrel monkeys112, rhesus macaques68,111,113,114, baboons115,116,117, and chimpanzees118,119). Beyond the general existence of statistical learning in nonhuman animals, considerable amounts of comparative research have focused on the specific types of statistical regularities that animals are able to learn, with some studies suggesting there may be differences across species.
Robustness
The general phenomenon that statistical learning is conserved in other primates (and likely more broadly in the animal kingdom) appears to be highly robust42,120. However, while all animals tested learn relatively simple ‘adjacent dependencies’ (where stimuli co-occur immediately in space or time), ‘nonadjacent dependencies’ (in which intervening elements separate these stimuli) may prove more difficult to learn (for a review, see121). Moreover, more complex classes of stimuli (e.g., those governed by supra-regular grammars108) have typically proved too complex for nonhuman primates to learn, at least under incidental conditions108,122 (although for a successful demonstration using operant approaches, see123). Thus, while core statistical learning abilities appear robust across primate taxa, species differences become pronounced as the complexity of the statistical relationships increases. Here, it is important to note that these results may not represent differences in statistical learning per se, but may instead relate to other species differences (e.g., perception, attention, motivation, etc.), as will be discussed in ‘Causality’, below.
Construct validity
A central challenge of studying animal cognition is the requirement that tasks be both feasible in nonhuman animals and able to accurately measure the constructs of interest, in this case statistical learning. Early research in nonhuman primates made use of paradigms originally developed to test preverbal infants, such as preferential looking or habituation/dishabituation paradigms measuring head-turn responses21,108,109,110,112, or equivalent responses using eye-tracking68,111,114. While these approaches allow reasonable comparisons of monkey and infant data, most adult human statistical learning research uses alternative methods (e.g., grammaticality judgement tasks), which are not feasible in nonhuman animals. A second popular method for testing nonhuman primate statistical learning is the use of SRT tasks, which have been combined with a range of different artificial grammars (e.g.,113,115,116), in which animals make a series of rapid responses with a joystick or on a touchscreen. These tasks allow direct comparisons with humans (as in ref. 116). However, SRT tasks may rely on somewhat different, more procedural learning processes compared to preferential looking paradigms, raising further challenges in assessing construct validity and particularly in drawing comparisons across studies and experimental designs. Additionally, adaptations to these designs allow the investigation of how primates use statistical regularities to encode sequences of stimuli into increasingly large ‘chunks’ over the course of training124,125. Given this range of approaches, many cross-species comparisons (either within a single comparative study (e.g.,111,114) or those aggregating across multiple studies) are based on disparate methods, which may, at least to some extent, measure different constructs. In cases where similar patterns of learning are observed despite such methodological differences, it may be reasonable to conclude not only that those similarities in statistical learning exist, but that they are robust even to these methodological differences. However, when differences are observed, it is difficult to conclude whether it is the animals’ statistical learning abilities that differ, or whether the different methods and approaches used across species actually measure somewhat different psychological constructs.
Causality
The clearest overarching conclusion to be taken from the comparative literature on statistical learning is that many species are capable of learning at least relatively simple statistical relationships. However, when statistical regularities become more complex, species differences appear to emerge. This raises the challenging question of determining whether these differences result from more limited statistical learning abilities in nonhuman animals (as has often been concluded) or whether other differences in perception, motivation, attention, or other psychological variables may contribute to these effects. One interesting demonstration of this is the recent move to testing nonhuman primates using more operant training approaches. Using traditional, incidental learning approaches, monkeys have not been shown to learn complex, supra-regular grammars. However, using an explicit, operant training approach in which rhesus macaques were rewarded for correct responses over many thousands of trials, monkeys were capable of learning these types of grammars123,126. It is highly likely that these tasks recruit different learning systems, but these task differences also introduce many other differences, including perceptual differences as well as different levels of motivation and attention, due to the promise of reward for correctly completed trials. This highlights the interplay between different psychological variables and the challenges of isolating statistical learning from other factors. While this specific example applies to learning in nonhuman primates, the same issues of perception, motivation, attention, etc., are likely relevant to human studies of statistical learning, and represent one of the major challenges to theory building in statistical learning.
Atypical statistical learning in developmental dyslexia
Finally, there has been much interest in examining whether statistical learning is atypical in different developmental disorders related to language and communication. Because statistical learning appears to be crucial for learning spoken and written language in typical development11, it is possible that certain language and communication disorders might arise from atypical statistical learning127. Developmental dyslexia is one such disorder that has been examined through the lens of statistical learning (e.g.,128,129,130,131). Note that while we focus here on developmental dyslexia, some of the same questions about robustness, validity, and causality are likely relevant for many other developmental disorders.
Robustness
Based on several meta-analytic reviews, which can be useful for systematically assessing robustness across studies (e.g.,38), the extent of a statistical learning impairment in dyslexia remains unclear. On the one hand, Lum et al.’s132 systematic review showed evidence of a statistical learning impairment in those with dyslexia, based on studies using SRT tasks in adults and children. Likewise, van Witteloostuijn et al.’s133 meta-analysis also found evidence of impaired learning in dyslexia, based on studies using AGL paradigms. However, their analysis suggested the presence of publication bias, which makes it harder to gauge the robustness of the relationship between statistical learning and dyslexia. Similarly, Schmalz et al.’s134 systematic review of statistical learning in dyslexia using both SRT and AGL tasks also noted the presence of publication bias and concluded that there is “insufficient high-quality data to draw conclusions.” Finally, Singh and Conway135 reviewed recent studies that were not included in the three prior meta-analytic and systematic reviews and concluded that there was some evidence for a statistical learning impairment in adults with dyslexia (with 11 of 19 studies showing impairment), but that the evidence for an impairment in children with dyslexia was weaker (with only 6 of 14 studies showing impairment). Singh and Conway135 proposed that some of the inconsistent findings could be due to heterogeneity in the tasks and methods used to assess statistical learning, the heterogeneity of dyslexia itself, and the role of publication bias. Thus, the finding of impaired statistical learning in developmental dyslexia does not appear to be a robust phenomenon.
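For readers unfamiliar with how such reviews gauge robustness and publication bias, the sketch below computes a DerSimonian-Laird random-effects pooled estimate and an Egger-style regression intercept (a rough index of funnel-plot asymmetry). The effect sizes and standard errors are invented for illustration and do not correspond to any of the meta-analyses cited above.

```python
# Minimal sketch of two tools used in the reviews discussed above: a
# DerSimonian-Laird random-effects pooled effect and an Egger-style regression
# intercept as a rough check for funnel-plot asymmetry (publication bias).
# The per-study effect sizes and standard errors below are made up.
import numpy as np

y = np.array([0.45, 0.30, 0.62, 0.10, 0.55, 0.25])   # per-study effect sizes (e.g., Hedges' g)
se = np.array([0.20, 0.15, 0.25, 0.10, 0.22, 0.12])  # per-study standard errors
v, k = se ** 2, len(y)

# DerSimonian-Laird estimate of between-study variance (tau^2)
w = 1 / v
fixed = np.sum(w * y) / np.sum(w)
q = np.sum(w * (y - fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

w_star = 1 / (v + tau2)                               # random-effects weights
pooled = np.sum(w_star * y) / np.sum(w_star)
pooled_se = np.sqrt(1 / np.sum(w_star))

# Egger's regression: standardized effect vs. precision; an intercept far from
# zero suggests small-study effects consistent with publication bias.
slope, intercept = np.polyfit(1 / se, y / se, 1)

print(f"pooled effect = {pooled:.2f} (SE {pooled_se:.2f}), tau^2 = {tau2:.3f}")
print(f"Egger intercept = {intercept:.2f}")
```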
Construct validity
As mentioned earlier, statistical learning research uses a variety of task methodologies, and it is currently unclear to what extent they measure the same construct. This in turn can lead to discrepancies across studies intending to measure statistical learning in dyslexia. For instance, if two studies examine statistical learning in dyslexia using different tasks, and the findings are inconsistent with each other, it is unclear whether such differences are due to differences in the use of the tasks or some other factor11,12,135. The solution, of course, is not to focus on only one statistical learning task, but instead to carry out systematic investigations of dyslexia using a variety of tasks, chosen based on theoretical reasons53, and used across multiple studies. And as is true for the other phenomena reviewed above, construct validity requires increasing our understanding of how different tasks engage different brain systems and therefore different underlying cognitive processes (e.g., the SRT task is heavily dependent on the basal ganglia136,137, while the triplet segmentation task instead appears to rely on a combination of neocortical sensory networks and the hippocampus2,7).
Causality
One issue for research investigating statistical learning in dyslexia is that the findings are essentially correlational in nature: observing a statistical learning deficit in individuals with dyslexia compared to a control group is consistent with the idea that the learning deficit caused reading problems, but it is not the only possible conclusion. Some other cognitive function (i.e., a “third variable” such as attention or phonological processing) may impact both statistical learning performance and reading ability. Moreover, statistical learning impairments may not cause developmental dyslexia but rather result from it. Establishing the causal nature of a statistical learning-dyslexia link would require experimentally influencing statistical learning ability, through the use of neurostimulation (e.g., TMS or tDCS138,139) or possibly dual-task interference paradigms140,141, and then observing how such manipulations impact reading ability (i.e., inducing a “virtual lesion”142). However, such attempts run into Eronen and Bringmann’s30 aforementioned “fat-handed” problem: direct attempts to experimentally manipulate statistical learning ability may instead affect related cognitive processes such as attention or working memory. Further complicating attempts at causality is the heterogeneity within dyslexia in terms of variability in reading143 and spelling ability profiles30,144. There are also conditions that often co-occur with developmental dyslexia (e.g., difficulties with attention, executive function, and motor skills143,145,146,147). The heterogeneity and possible comorbidity of developmental dyslexia present a further challenge to establishing causality, because even if statistical learning has a causal influence on reading ability and disability, this may only be the case for certain individuals.
Discussion and recommendations
In sum, the statistical learning phenomena reviewed here vary in their robustness and have similar though also unique issues related to construct validity and causality. Some issues are generalizable to much of statistical learning research (e.g., separating statistical learning abilities from other psychological processes such as memory, attention, and motivation), whereas others are more specific to each area (e.g., unavoidable methodological differences when testing different ages or species). As discussed above, the general phenomenon of statistical learning—incidentally acquiring information from environmental stimuli—is highly robust. Based on the present evidence, modality effects, specifically the auditory-serial and visual-spatial advantages for learning, also appear to be relatively robust findings. Likewise, the presence of statistical learning across age and across nonhuman species also appears to be robust, though it is likely that not all species learn all types of patterns equally well. The extent to which statistical learning changes with age, as well as the extent to which statistical learning is impaired in developmental dyslexia, do not seem to be robust phenomena. For all four areas of research, crucial questions about construct validity and causality must be addressed to ensure an accurate understanding of the phenomena in question. While we focused on these four phenomena as ones that have received substantial coverage in the research literature, we acknowledge that many other relevant statistical learning phenomena also exist, and may face similar challenges related to robustness, validity, and causality.
While each statistical learning research phenomenon varies in the extent and nature of these challenges, and thus may require unique solutions, we believe the following recommendations are general enough to broadly deal with many of the most important and common issues facing statistical learning research. We also note that some of these recommendations may be useful for other areas of psychological research that face similar challenges.
There may be value in taking a “bottom-up” approach to theory building
Eronen and Bringmann30 (see also29,148,149,150) suggested that psychology needs more “phenomenon-driven” research, with a focus on identifying the robust phenomena, so that we know what findings theories must encapsulate. Discovering new (and robust) phenomena will help constrain the possible theory space so that we avoid attempting to develop theoretical frameworks to explain findings that turn out not to be robust. Furthermore, as our understanding of robust phenomena increases, it becomes possible to see abstract patterns across them that would otherwise go unnoticed. Note that a bottom-up approach to theory-building does not preclude a top-down approach to experimental design or hypothesis-driven research; in fact, these are complementary, not antagonistic, approaches.
Relevant concepts need to be clearly and transparently defined
As mentioned earlier, statistical learning is sometimes defined differently across studies and assessed in different ways11,53, potentially causing serious issues whereby a single term might be used to refer to several quite different cognitive or neural processes14. Until we have a clear and accurate understanding of relevant constructs and how best to measure them, it will be impossible to know what the robust phenomena are and how constructs causally relate to one another. Thus, while much recent focus has been on the reliability of statistical learning measures50,51, and rightly so, the issue of construct validity is also critical to consider. Relatedly, it is important to point out that using different measures for statistical learning is not inherently counterproductive (see next recommendation); however, we should be aware that using different methodological approaches for studying statistical learning may mean we are not measuring the same processes.
The field will benefit from a diversity of statistical learning measures
While it is important to be mindful of methodological differences across statistical learning tasks, a wide range of methods are needed to fully understand the relevant phenomena. As Eronen and Bringmann30 pointed out, robustness is improved if a given phenomenon is observed across multiple methods and measures. Thus, we suggest that future research investigating statistical learning phenomena—even within individual studies—should incorporate multiple measures of statistical learning, to determine the robustness and generalizability of any such findings, rather than focusing on a single task9. A more complete understanding of statistical learning requires a deep exploration of how differences in the type of task, nature of the input, individual differences, and other factors affect the learning process and the resulting knowledge that is acquired. Even so, a diversity of measures is only valuable if we know what they are measuring (i.e., again, good construct validity is necessary).
There is value in methods that can be commonly used across different modalities, ages, species, and populations
While we recommend incorporating a diversity of (valid) statistical learning measures, some areas of research will benefit from the use of common methods. For instance, using equivalent methods and measures may be important when attempting to examine potential commonalities and/or differences across modalities, ages, species, or populations. As pointed out above, this may be especially true when interpreting observed learning differences (e.g., across age, or species, etc.). If a difference across age or species is observed using different methods or tasks, it is not clear whether it reflects the difference in methods or an actual difference due to age or species. Some methods, such as the SRT paradigm, have been used successfully across species and across much of the human lifespan94,116; similarly, neural measures can be adapted across different ages and species, in addition to acquiring behavioral data.
Statistical learning should not be studied in a vacuum
To better understand statistical learning phenomena, it is also important to measure as many other relevant processes and variables as possible to see what other factors may be associated with statistical learning9. For instance, several lines of research suggest that there may be a competitive relationship between executive functions and statistical learning71,141. Consequently, if a difference in statistical learning is uncovered between species, or between ages, it is important to determine whether such differences are due to statistical learning per se or to differences in executive function, or some other psychological variable that may also relate to statistical learning, such as motivation or attention. Solving this issue requires incorporating well-validated measures not just of statistical learning but also of other relevant psychological constructs, and considering the interactions between them.
Interactions among phenomena should be explored
For convenience, the four sample phenomena evaluated above were presented largely independent of one another; however, in reality, the various phenomena interact in potentially complex ways. For instance, as noted above, there is some suggestion that the developmental trajectory of statistical learning is affected by modality, with improvements in visual but not auditory statistical learning being found across age. Likewise, modality effects observed in humans may or may not generalize to nonhumans68. There is also evidence that statistical learning impairments in developmental dyslexia may depend at least somewhat on age and possibly sensory modality as well (see135). Thus, to fully evaluate and understand each phenomenon may require understanding their interactions with one another.
Conclusion
Despite the surge of interest in statistical learning, there are a number of critical challenges facing this area of research9,11,12,13. We have argued that these challenges are similar to those facing the psychological sciences more generally. Relying on recent conceptual advances in contending with the theory crisis30, we have evaluated a sample of statistical learning research areas in terms of the robustness of each phenomenon, issues with construct validity, and difficulties with establishing causality. We have also offered recommendations for how to deal with these issues in a way that we believe will lead to promising areas for theory-building, specifically by focusing on ways to better establish robustness, validity, and causality in statistical learning research.
We hope this analysis and the recommendations contained within will improve future attempts at constructing theories to explain various statistical learning phenomena. Such theories would need to be pitched both at a higher level (i.e., what neural mechanisms and cognitive processes underlie statistical learning itself32 and how they relate to other psychological and cognitive constructs) and at the level of the embedded sub-phenomena reviewed here (i.e., theories to explain how and why differences exist, or not, across modality, age, species, and developmental disorders). The development of such theories will be crucially dependent upon the continued identification and exploration of robust phenomena, the refinement of a diverse set of valid measures, and an understanding of the causal relationships relevant to statistical learning.
Data availability
No datasets were generated or analysed during the current study.
References
Aslin, R. N. Statistical learning: a powerful mechanism that operates by mere exposure. WIRES Cogn. Sci. 8, e1373 (2017).
Frost, R., Armstrong, B. C., Siegelman, N. & Christiansen, M. H. Domain generality versus modality specificity: the paradox of statistical learning. Trends Cogn. Sci. 19, 117–125 (2015).
Saffran, J. R. & Kirkham, N. Z. Infant statistical learning. Annu. Rev. Psychol. 69, 181–203 (2018).
Saffran, J. R., Aslin, R. N. & Newport, E. L. Statistical learning by 8-month-old infants. Science 274, 1926–1928 (1996).
Reber, A. S. Implicit learning of artificial grammars. J. Verbal Learn. Verbal Behav. 6, 855–863 (1967).
Christiansen, M. H. Implicit statistical learning: a tale of two literatures. Top. Cogn. Sci. 11, 468–481 (2019).
Conway, C. M. How does the brain learn environmental structure? Ten core principles for understanding the neurocognitive mechanisms of statistical learning. Neurosci. Biobehav. Rev. 112, 279–299 (2020).
Sherman, B. E., Graves, K. N. & Turk-Browne, N. B. The prevalence and importance of statistical learning in human cognition and behavior. Curr. Opin. Behav. Sci. 32, 15–20 (2020).
Frost, R., Armstrong, B. C. & Christiansen, M. H. Statistical learning research: a critical review and possible new directions. Psychol. Bull. 145, 1128–1153 (2019).
Turk-Browne, N. B., Scholl, B. J., Chun, M. M. & Johnson, M. K. Neural evidence of statistical learning: efficient detection of visual regularities without awareness. J. Cogn. Neurosci. 21, 1934–1945 (2009).
Arciuli, J. & Conway, C. M. The promise—and challenge—of statistical learning for elucidating atypical language development. Curr. Dir. Psychol. Sci. 27, 492–500 (2018).
Bogaerts, L., Frost, R. & Christiansen, M. H. Integrating statistical learning into cognitive science. J. Mem. Lang. 115, 104167 (2020).
Siegelman, N. Statistical learning abilities and their relation to language. Lang. Linguist. Compass 14, e12365 (2020).
Willingham, D. B. & Preuss, L. The death of implicit memory. Psyche. 2, 1–10 (1995).
Erickson, L. C. & Thiessen, E. D. Statistical learning of language: theory, validity, and predictions of a statistical learning account of language acquisition. Dev. Rev. 37, 66–108 (2015).
Thiessen, E. D., Girard, S. & Erickson, L. C. Statistical learning and the critical period: how a continuous learning mechanism can give rise to discontinuous learning. WIRES Cogn. Sci. 7, 276–288 (2016).
Thiessen, E. D. & Erickson, L. C. Beyond word segmentation: a two- process account of statistical learning. Curr. Dir. Psychol. Sci. 22, 239–243 (2013).
Thiessen, E. D., Kronstein, A. T. & Hufnagle, D. G. The extraction and integration framework: a two-process account of statistical learning. Psychol. Bull. 139, 792–814 (2013).
Perruchet, P. What mechanisms underlie implicit statistical learning? Transitional probabilities versus chunks in language learning. Top. Cogn. Sci. 11, 520–535 (2019).
Perruchet, P. & Pacton, S. Implicit learning and statistical learning: one phenomenon, two approaches. Trends Cogn. Sci. 10, 233–238 (2006).
Reber, S. A. et al. Common marmosets are sensitive to simple dependencies at variable distances in an artificial grammar. Evol. Hum. Behav. 40, 214–221 (2019).
Arciuli, J. The multi-component nature of statistical learning. Philos. Trans. R. Soc. B Biol. Sci. 372, 20160058 (2017).
Daltrozzo, J. & Conway, C. M. Neurocognitive mechanisms of statistical-sequential learning: what do event-related potentials tell us? Front. Hum. Neurosci. 8, 437 (2014).
Dehaene, S., Meyniel, F., Wacongne, C., Wang, L. & Pallier, C. The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees. Neuron 88, 2–19 (2015).
Savalia, T., Shukla, A. & Bapi, R. S. A unified theoretical framework for cognitive sequencing. Front. Psychol. 7, 1821 (2016).
Gigerenzer, G. Personal reflections on theory and psychology. Theory Psychol. 20, 733–743 (2010).
Muthukrishna, M. & Henrich, J. A problem in theory. Nat. Hum. Behav. 3, 221–229 (2019).
Oberauer, K. & Lewandowsky, S. Addressing the theory crisis in psychology. Psychon. Bull. Rev. 26, 1596–1618 (2019).
Borsboom, D., van der Maas, H. L. J., Dalege, J., Kievit, R. A. & Haig, B. D. Theory construction methodology: a practical framework for building theories in psychology. Perspect. Psychol. Sci. 16, 756–766 (2021).
Eronen, M. I. & Bringmann, L. F. The theory crisis in psychology: how to move forward. Perspect. Psychol. Sci. 16, 779–788 (2021).
Marr, D. Vision: A computational investigation into the human representation and processing of visual information. https://doi.org/10.7551/mitpress/9780262514620.001.0001 (The MIT Press, 2010).
Conway, C. M., Janacsek, K., Buffington, J. & Ullman, M. T. The what, how, and where of statistical learning. Nat. Rev. Neurosci. (under review).
MacLeod, C. M. Half a century of research on the Stroop effect: an integrative review. Psychol. Bull. 109, 163–203 (1991).
Stroop, J. R. Studies of interference in serial verbal reactions. J. Exp. Psychol. 18, 643–662 (1935).
Greenwald, A. G. & Banaji, M. R. Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychol. Rev. 102, 4–27 (1995).
Eronen, M. I. Robustness and reality. Synthese 192, 3961–3977 (2015).
Eronen, M. I. Robust realism for the life sciences. Synthese 196, 2341–2354 (2019).
Isbilen, E. S. & Christiansen, M. H. Statistical learning of language: a meta-analysis into 25 years of research. Cogn. Sci. 46, e13198 (2022).
Conway, C. M. & Christiansen, M. H. Modality-constrained statistical learning of tactile, visual, and auditory sequences. J. Exp. Psychol. Learn. Mem. Cogn. 31, 24–39 (2005).
Pothos, E. An entropy model for artificial grammar learning. Front. Psychol. 1, 16 (2010).
Raviv, L. & Arnon, I. The developmental trajectory of children’s auditory and visual statistical learning abilities: modality-based differences in the effect of age. Dev. Sci. 21, e12593 (2018).
Santolin, C. & Saffran, J. R. Constraints on statistical learning across species. Trends Cogn. Sci. 22, 52–63 (2018).
Turk-Browne, N. B., Jungé, J. A. & Scholl, B. J. The automaticity of visual statistical learning. J. Exp. Psychol. Gen. 134, 552–564 (2005).
Pedraza, F. et al. Evidence for a competitive relationship between executive functions and statistical learning. npj Sci. Learn. 9, 30 (2024).
Saffran, J. R. Statistical learning as a window into developmental disabilities. J. Neurodev. Disord. 10, 35 (2018).
Borsboom, D., Mellenbergh, G. J. & van Heerden, J. The concept of validity. Psychol. Rev. 111, 1061–1071 (2004).
Flake, J. K., Pek, J. & Hehman, E. Construct validation in social and personality research: current practice and recommendations. Soc. Psychol. Personal. Sci. 8, 370–378 (2017).
Zumbo, B. D. & Chan, E. K. H. Setting the stage for validity and validation in social, behavioral, and health sciences: trends in validation practices. In Validity and Validation in Social, Behavioral, and Health Sciences (eds Zumbo, B. D. & Chan, E. K. H.) 3–8 https://doi.org/10.1007/978-3-319-07794-9_1 (Springer International Publishing, Cham, 2014).
Arnon, I. Do current statistical learning tasks capture stable individual differences in children? An investigation of task reliability across modality. Behav. Res. 52, 68–81 (2020).
Siegelman, N., Bogaerts, L., Christiansen, M. H. & Frost, R. Towards a theory of individual differences in statistical learning. Philos. Trans. R. Soc. Lond. B Biol. Sci. 372, 20160059 (2017).
Siegelman, N., Bogaerts, L. & Frost, R. Measuring individual differences in statistical learning: Current pitfalls and possible solutions. Behav. Res. 49, 418–432 (2017).
Siegelman, N. & Frost, R. Statistical learning as an individual ability: theoretical perspectives and empirical evidence. J. Mem. Lang. 81, 105–120 (2015).
Bogaerts, L., Siegelman, N. & Frost, R. Statistical learning and language impairments: toward more precise theoretical accounts. Perspect. Psychol. Sci. 16, 319–337 (2021).
Nissen, M. J. & Bullemer, P. Attentional requirements of learning: Evidence from performance measures. Cogn. Psychol. 19, 1–32 (1987).
Hebb, D. Distinctive features of learning in the higher animal. In Brain Mechanisms of Learning 37–46 (Blackwell, Oxford, UK, 1961).
Yu, C. & Smith, L. B. Rapid word learning under uncertainty via cross-situational statistics. Psychol. Sci. 18, 414–420 (2007).
Chun, M. M. & Jiang, Y. Contextual cueing: implicit learning and memory of visual context guides spatial attention. Cogn. Psychol. 36, 28–71 (1998).
Eronen, M. I. Causal discovery and the problem of psychological interventions. New Ideas Psychol. 59, 100785 (2020).
Woodward, J. Methodology, ontology, and interventionism. Synthese 192, 3577–3599 (2015).
Center, E. G., Federmeier, K. D. & Beck, D. M. The brain’s sensitivity to real-world statistical regularity does not require full attention. J. Cogn. Neurosci. 36, 1715–1740 (2024).
Sohoglu, E. & Chait, M. Detecting and representing predictable structure during auditory scene analysis. eLife 5, e19113 (2016).
Romberg, A. R. & Saffran, J. R. Statistical learning and language acquisition. WIREs Cogn. Sci. 1, 906–914 (2010).
Nemeth, D., Hallgató, E., Janacsek, K., Sándor, T. & Londe, Z. Perceptual and motor factors of implicit skill learning. NeuroReport 20, 1654–1658 (2009).
Hallgató, E., Győri-Dani, D., Pekár, J., Janacsek, K. & Nemeth, D. The differential consolidation of perceptual and motor learning in skill acquisition. Cortex 49, 1073–1081 (2013).
Conway, C. M. & Christiansen, M. H. Seeing and hearing in space and time: effects of modality and presentation rate on implicit statistical learning. Eur. J. Cogn. Psychol. 21, 561–580 (2009).
Emberson, L. L., Conway, C. M. & Christiansen, M. H. Timing is everything: changes in presentation rate have opposite effects on auditory and visual implicit statistical learning. Q. J. Exp. Psychol. 64, 1021–1040 (2011).
Lukics, K. S. & Lukács, Á. Modality, presentation, domain and training effects in statistical learning. Sci. Rep. 12, 20878 (2022).
Milne, A. E., Petkov, C. I. & Wilson, B. Auditory and visual sequence learning in humans and monkeys using an artificial grammar learning paradigm. Neuroscience 389, 104–117 (2018).
Walk, A. M. & Conway, C. M. Cross-domain statistical–sequential dependencies are difficult to learn. Front. Psychol. 7, 250 (2016).
Milne, A. E., Wilson, B. & Christiansen, M. H. Structured sequence learning across sensory modalities in humans and nonhuman primates. Curr. Opin. Behav. Sci. 21, 39–48 (2018).
Vékony, T. et al. Modality-specific and modality-independent neural representations work in concert in predictive processes during sequence learning. Cereb. Cortex 33, 7783–7796 (2023).
Bogaerts, L., Siegelman, N., Christiansen, M. H. & Frost, R. Is there such a thing as a ‘good statistical learner’? Trends Cogn. Sci. 26, 25–37 (2022).
Polyanskaya, L. et al. Intermodality differences in statistical learning: phylogenetic and ontogenetic influences. Ann. N. Y. Acad. Sci. 1511, 191–209 (2022).
Conway, C. M. & Christiansen, M. H. Statistical learning within and between modalities: pitting abstract against stimulus-specific representations. Psychol. Sci. 17, 905–912 (2006).
Durrant, S. J., Cairney, S. A. & Lewis, P. A. Cross-modal transfer of statistical information benefits from sleep. Cortex 78, 85–99 (2016).
Ordin, M., Polyanskaya, L. & Samuel, A. G. An evolutionary account of intermodality differences in statistical learning. Ann. N. Y. Acad. Sci. 1486, 76–89 (2021).
Zhou, H., van der Ham, S., de Boer, B., Bogaerts, L. & Raviv, L. Modality and stimulus effects on distributional statistical learning: sound vs. sight, time vs. space. J. Mem. Lang. 138, 104531 (2024).
Ren, J. & Wang, M. Development of statistical learning ability across modalities, domains, and languages. J. Exp. Child Psychol. 226, 105570 (2023).
Baddeley, A. D. & Hitch, G. Working memory. In Psychology of Learning and Motivation (ed. Bower, G. H.) Vol. 8, 47–89 (Academic Press, 1974).
Lehnert, G. & Zimmer, H. D. Modality and domain specific components in auditory and visual working memory tasks. Cogn. Process 9, 53–61 (2008).
Zimmer, H. D. Visual and spatial working memory: From boxes to networks. Neurosci. Biobehav. Rev. 32, 1373–1395 (2008).
Shinn-Cunningham, B. G. Object-based auditory and visual attention. Trends Cogn. Sci. 12, 182–186 (2008).
Milne, A. E., Zhao, S., Tampakaki, C., Bury, G. & Chait, M. Sustained pupil responses are modulated by predictability of auditory sequences. J. Neurosci. 41, 6116–6127 (2021).
Jost, E., Conway, C. M., Purdy, J. D., Walk, A. M. & Hendricks, M. A. Exploring the neurodevelopment of visual statistical learning using event-related brain potentials. Brain Res. 1597, 95–107 (2015).
Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A. & Barrueco, S. Incidental language learning: Listening (and learning) out of the corner of your ear. Psychol. Sci. 8, 101–105 (1997).
Arciuli, J. & Simpson, I. C. Statistical learning in typically developing children: The role of age and speed of stimulus presentation. Dev. Sci. 14, 464–473 (2011).
Kirkham, N. Z., Slemmer, J. A. & Johnson, S. P. Visual statistical learning in infancy: evidence for a domain general learning mechanism. Cognition 83, B35–B42 (2002).
Kirkham, N. Z., Slemmer, J. A., Richardson, D. C. & Johnson, S. P. Location, location, location: development of spatiotemporal sequence learning in infancy. Child Dev. 78, 1559–1571 (2007).
Lukács, Á. & Kemény, F. Development of different forms of skill learning throughout the lifespan. Cogn. Sci. 39, 383–404 (2015).
Thomas, K. M. et al. Evidence of developmental differences in implicit sequence learning: an fMRI study of children and adults. J. Cogn. Neurosci. 16, 1339–1351 (2004).
Vaidya, C. J., Huger, M., Howard, D. V. & Howard, J. H. Jr Developmental differences in implicit learning of spatial context. Neuropsychology 21, 497 (2007).
Gathercole, S. E., Pickering, S. J., Ambridge, B. & Wearing, H. The structure of working memory from 4 to 15 years of age. Dev. Psychol. 40, 177 (2004).
Birdsong, D. Second Language Acquisition and the Critical Period Hypothesis (Routledge, 1999).
Janacsek, K., Fiser, J. & Nemeth, D. The best time to acquire new skills: age-related differences in implicit sequence learning across the human lifespan. Dev. Sci. 15, 496–505 (2012).
Newport, E. L. Maturational constraints on language learning. Cogn. Sci. 14, 11–28 (1990).
Tóth-Fáber, E., Farkas, B. C., Harmath-Tánczos, T., Nemeth, D. & Janacsek, K. Longitudinal evidence for decreasing statistical learning abilities across childhood. Preprint at https://doi.org/10.31234/osf.io/gj3hq (2024).
Shufaniya, A. & Arnon, I. Statistical learning is not age-invariant during childhood: performance improves with age across modality. Cogn. Sci. 42, 3100–3115 (2018).
Juhasz, D., Nemeth, D. & Janacsek, K. Is there more room to improve? The lifespan trajectory of procedural learning and its relationship to the between- and within-group differences in average response times. PLoS ONE 14, e0215116 (2019).
Lammertink, I., Van Witteloostuijn, M., Boersma, P., Wijnen, F. & Rispens, J. Auditory statistical learning in children: novel insights from an online measure. Appl. Psycholinguist. 40, 279–302 (2019).
Lammertink, I., Boersma, P., Wijnen, F. & Rispens, J. Children with developmental language disorder have an auditory verbal statistical learning deficit: evidence from an online measure. Lang. Learn. 70, 137–178 (2020).
van der Lely, H. K., Jones, M. & Marshall, C. R. Who did Buzz see someone? Grammaticality judgement of wh-questions in typically developing children and children with Grammatical-SLI. Lingua 121, 408–422 (2011).
Jenkins, H. E., Leung, P., Smith, F., Riches, N. & Wilson, B. Assessing processing-based measures of implicit statistical learning: three serial reaction time experiments do not reveal artificial grammar learning. PLoS ONE 19, e0308653 (2024).
Misyak, J. B., Christiansen, M. H. & Tomblin, J. B. On-line individual differences in statistical learning predict language processing. Front. Psychol. 1, 31 (2010).
Conway, C. M., Bauernschmidt, A., Huang, S. S. & Pisoni, D. B. Implicit statistical learning in language processing: word predictability is the key. Cognition 114, 356–371 (2010).
Isbilen, E. S., McCauley, S. M., Kidd, E. & Christiansen, M. H. Statistically induced chunking recall: a memory-based approach to statistical learning. Cogn. Sci. 44, e12848 (2020).
Kidd, E. et al. Measuring children’s auditory statistical learning via serial recall. J. Exp. Child Psychol. 200, 104964 (2020).
Moreau, C. N., Joanisse, M. F., Mulgrew, J. & Batterink, L. J. No statistical learning advantage in children over adults: evidence from behaviour and neural entrainment. Dev. Cogn. Neurosci. 57, 101154 (2022).
Fitch, W. T. & Hauser, M. D. Computational constraints on syntactic processing in a nonhuman primate. Science 303, 377–380 (2004).
Newport, E. L., Hauser, M. D., Spaepen, G. & Aslin, R. N. Learning at a distance II. Statistical learning of non-adjacent dependencies in a non-human primate. Cogn. Psychol. 49, 85–117 (2004).
Saffran, J. R. et al. Grammatical pattern learning by human infants and cotton-top tamarin monkeys. Cognition 107, 479–500 (2008).
Wilson, B. et al. Auditory artificial grammar learning in macaque and marmoset monkeys. J. Neurosci. 33, 18825–18835 (2013).
Ravignani, A., Sonnweber, R.-S., Stobbe, N. & Fitch, W. T. Action at a distance: dependency sensitivity in a New World primate. Biol. Lett. 9, 20130852 (2013).
Heimbauer, L. A., Conway, C. M., Christiansen, M. H., Beran, M. J. & Owren, M. J. Visual artificial grammar learning by rhesus macaques (Macaca mulatta): exploring the role of grammar complexity and sequence length. Anim. Cogn. 21, 267–284 (2018).
Wilson, B., Smith, K. & Petkov, C. I. Mixed-complexity artificial grammar learning in humans and macaque monkeys: evaluating learning strategies. Eur. J. Neurosci. 41, 568–578 (2015).
Malassis, R., Rey, A. & Fagot, J. Non-adjacent dependencies processing in human and non-human primates. Cogn. Sci. 42, 1677–1699 (2018).
Rey, A., Minier, L., Malassis, R., Bogaerts, L. & Fagot, J. Regularity extraction across species: associative learning mechanisms shared by human and non-human primates. Top. Cogn. Sci. 11, 573–586 (2019).
Yeaton, J., Tosatto, L., Fagot, J., Grainger, J. & Rey, A. Simple questions on simple associations: regularity extraction in non-human primates. Learn Behav. 51, 392–401 (2023).
Endress, A. D., Cahill, D., Block, S., Watumull, J. & Hauser, M. D. Evidence of an evolutionary precursor to human language affixation in a non-human primate. Biol. Lett. 5, 749–751 (2009).
Watson, S. K. et al. Nonadjacent dependency processing in monkeys, apes, and humans. Sci. Adv. 6, eabb0725 (2020).
Wilson, B., Marslen-Wilson, W. D. & Petkov, C. I. Conserved sequence processing in primate frontal cortex. Trends Neurosci. 40, 72–82 (2017).
Wilson, B. et al. Non-adjacent dependency learning in humans and other animals. Top. Cogn. Sci. 12, 843–858 (2020).
van Heijningen, C. A. A., de Visser, J., Zuidema, W. & ten Cate, C. Simple rules can explain discrimination of putative recursive syntactic structures by a songbird species. Proc. Natl Acad. Sci. 106, 20538–20543 (2009).
Jiang, X. et al. Production of supra-regular spatial sequences by Macaque monkeys. Curr. Biol. 28, 1851–1859.e4 (2018).
Tosatto, L., Fagot, J., Nemeth, D. & Rey, A. Chunking as a function of sequence length. Anim. Cogn. 28, 2 (2024).
Tosatto, L., Fagot, J., Nemeth, D. & Rey, A. The evolution of chunks in sequence learning. Cogn. Sci. 46, e13124 (2022).
Fitch, W. T. Bio-linguistics: monkeys break through the syntax barrier. Curr. Biol. 28, R695–R697 (2018).
Plante, E. & Gómez, R. L. Learning without trying: the clinical relevance of statistical learning. Lang. Speech Hear. Serv. Sch. 49, 710–722 (2018).
Ballan, R., Durrant, S. J., Manoach, D. S. & Gabay, Y. Failure to consolidate statistical learning in developmental dyslexia. Psychon. Bull. Rev. 30, 160–173 (2023).
Daikoku, T. et al. Neural correlates of statistical learning in developmental dyslexia: an electroencephalography study. Biol. Psychol. 181, 108592 (2023).
Gabay, Y., Thiessen, E. D. & Holt, L. L. Impaired statistical learning in developmental dyslexia. J. Speech Lang. Hear. Res. 58, 934–945 (2015).
Ozernov-Palchik, O., Qi, Z., Beach, S. D. & Gabrieli, J. D. E. Intact procedural memory and impaired auditory statistical learning in adults with dyslexia. Neuropsychologia 188, 108638 (2023).
Lum, J. A. G., Ullman, M. T. & Conti-Ramsden, G. Procedural learning is impaired in dyslexia: evidence from a meta-analysis of serial reaction time studies. Res. Dev. Disabil. 34, 3460–3476 (2013).
van Witteloostuijn, M., Boersma, P., Wijnen, F. & Rispens, J. Visual artificial grammar learning in dyslexia: a meta-analysis. Res. Dev. Disabil. 70, 126–137 (2017).
Schmalz, X., Altoè, G. & Mulatti, C. Statistical learning and dyslexia: a systematic review. Ann. Dyslexia 67, 147–162 (2017).
Singh, S. & Conway, C. M. Unraveling the interconnections between statistical learning and dyslexia: a review of recent empirical studies. Front. Hum. Neurosci. 15, 734179 (2021).
Doyon, J. et al. Contributions of the basal ganglia and functionally related brain structures to motor learning. Behav. Brain Res. 199, 61–75 (2009).
Ullman, M. T., Earle, F. S., Walenski, M. & Janacsek, K. The neurocognition of developmental disorders of language. Annu. Rev. Psychol. 71, 389–417 (2020).
Ambrus, G. G. et al. When less is more: Enhanced statistical learning of non-adjacent dependencies after disruption of bilateral DLPFC. J. Mem. Lang. 114, 104144 (2020).
Smalle, E. H. M., Panouilleres, M., Szmalec, A. & Möttönen, R. Language learning in the adult brain: disrupting the dorsolateral prefrontal cortex facilitates word-form learning. Sci. Rep. 7, 13966 (2017).
Hendricks, M. A., Conway, C. M. & Kellogg, R. T. Using dual-task methodology to dissociate automatic from nonautomatic processes involved in artificial grammar learning. J. Exp. Psychol. Learn. Mem. Cogn. 39, 1491–1500 (2013).
Smalle, E. H. M., Daikoku, T., Szmalec, A., Duyck, W. & Möttönen, R. Unlocking adults’ implicit statistical learning by cognitive depletion. Proc. Natl Acad. Sci. 119, e2026011119 (2022).
Woollams, A. M., Madrid, G. & Lambon Ralph, M. A. Using neurostimulation to understand the impact of pre-morbid individual differences on post-lesion outcomes. Proc. Natl Acad. Sci. 114, 12279–12284 (2017).
Snowling, M. J., Hayiou-Thomas, M. E., Nash, H. M. & Hulme, C. Dyslexia and developmental language disorder: comorbid disorders with distinct effects on reading comprehension. J. Child Psychol. Psychiatry 61, 672–680 (2020).
Georgiou, G. K., Martinez, D., Vieira, A. P. A. & Guo, K. Is orthographic knowledge a strength or a weakness in individuals with dyslexia? Evidence from a meta-analysis. Ann. Dyslexia 71, 5–27 (2021).
Catts, H. W., Adlof, S. M., Hogan, T. P. & Weismer, S. E. Are specific language impairment and dyslexia distinct disorders? J. Speech Lang. Hear. Res. 48, 1378–1396 (2005).
Pennington, B. F. et al. Individual prediction of dyslexia by single versus multiple deficit models. J. Abnorm. Psychol. 121, 212–224 (2012).
Gooch, D., Hulme, C., Nash, H. M. & Snowling, M. J. Comorbidities in preschool children at family risk of dyslexia. J. Child Psychol. Psychiatry 55, 237–246 (2014).
Haig, B. D. Detecting psychological phenomena: taking bottom-up research seriously. Am. J. Psychol. 126, 135–153 (2013).
De Houwer, J. Why the cognitive approach in psychology would profit from a functional approach and vice versa. Perspect. Psychol. Sci. 6, 202–209 (2011).
Hughes, S., De Houwer, J. & Perugini, M. The functional-cognitive framework for psychological research: controversies and resolutions. Int. J. Psychol. 51, 4–14 (2016).
Darwin, C. On the Origin of Species by Means of Natural Selection (John Murray, London, 1859).
Acknowledgements
Preparation of this manuscript was supported by a Grinnell College Faculty Scholarship Competitive Grant (CMC), a Wellcome Trust Grant (213686/Z/18/Z) (AEM), and by the National Institutes of Health’s Office of the Director, Office of Research Infrastructure Programs (P51OD011132) (BW).
Author information
Contributions
All authors (C.M.C., H.E.J., A.E.M., S.S., and B.W.) contributed to the writing of the manuscript and read and approved the final version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Conway, C.M., Jenkins, H.E., Milne, A.E. et al. Addressing the theory crisis in statistical learning research. npj Sci. Learn. 10, 68 (2025). https://doi.org/10.1038/s41539-025-00359-6