Introduction

Our surroundings are full of structured patterns and regularities. To operate efficiently in this complex environment, an organism has to be equipped with abilities to find, learn, and utilize these environmental structures and regularities. Statistical learning is a powerful mechanism for extracting and encoding structure from environmental stimuli1. This form of learning is ubiquitous in human cognition: studies have shown that it is present in the auditory, visual, and tactile modalities and across the linguistic and nonlinguistic domains2,3,4,5,6,7,8,9,10,11,12, and it also operates in multimodal, visuomotor tasks13,14.

Statistical learning supports many skills in our everyday life. For instance, language, consisting of complex patterns and regularities on multiple levels, has been suggested to rely on it15,16,17,18,19,20,21,22,23,24,25. While the contribution of statistical learning is most frequently highlighted in language, several results have shown that this mechanism is not limited to it: it has an important role in domains such as music acquisition26, event processing27, and the acquisition of complex visual stimuli like scenes or faces28, suggesting a diverse and varied role for statistical learning in human cognition. As the human cognitive system faces great diversity in learning materials, differences in the properties of the input may impose different constraints on statistical learning in each area1,29. Input constraints are especially important in statistical learning because this form of learning is model-free and input-driven compared to other forms of learning, such as reinforcement learning or declarative learning29. To understand how this fundamental mechanism operates in different areas of cognition, we aim to uncover how input characteristics and their interactions affect learning.

While the variability of areas where statistical learning is present may suggest generality, direct comparisons of learning the same structure with stimuli from different domains and modalities indicate the presence of modality- and domain-specific constraints. (In the present paper, we use domain to refer to the content of representations, more specifically, to denote the linguistic-nonlinguistic distinction in our tasks.) These effects have mostly been demonstrated in artificial language learning tasks, in which a few novel items are organized into sequences based on simple patterns. After being exposed to a set of grammatical sequences, humans are able to distinguish grammatical from ungrammatical sequences. These studies have shown that statistical learning of serially presented (i.e., one stimulus after the other) nonlinguistic auditory patterns is more efficient than the extraction of serial nonlinguistic visual patterns, which in turn is better than learning serial nonlinguistic tactile patterns3. In general, statistical learning is assumed to have modality- or even stimulus-specific characteristics1,29.

Importantly, these modality effects are likely to result from differences in parameters of optimal presentation. While sequential information in the auditory modality is only available through serial presentation of stimuli, for visual sequences, serial and simultaneous presentation are both feasible. Simultaneous presentation, where items of a sequence are presented together at the same time, seems to be optimal for statistical learning of nonlinguistic visual sequences30,31. When visual information is presented simultaneously, performance is similar to nonlinguistic auditory learning30,31. Presentation rates also affect learning differently in different modalities, and slower rates seem to facilitate visual statistical learning: when the presentation rate is slower with serial linguistic visual than with serial linguistic auditory stimuli, learning performance is equivalent across the modalities69.

While modality differences in statistical learning have been demonstrated in several studies, tests of domain effects, i.e., direct comparisons of linguistic versus nonlinguistic materials, are hard to find. One notable exception is Saffran31, who explored both domain (linguistic versus nonlinguistic) and modality (auditory versus visual) effects in an artificial grammar learning task and found no overall advantage of sequence learning in the linguistic over the nonlinguistic domain (or in the auditory over the visual modality) with serial presentation of sequences. However, the focus of that study was on contrasting two types of grammars (grammars with predictive and non-predictive dependencies) within each condition, rather than on directly comparing performance across domains and modalities. Although it was not the primary focus of their study, Hoch, Tyler and Tillmann70 directly compared statistical learning in the linguistic and nonlinguistic domains, observing significantly higher levels of learning in the linguistic than in the nonlinguistic domain.

Besides constraints imposed by modality, presentation type, and domain, different arrangements of stimuli during training (training type) also influence statistical learning. The starting small hypothesis assumes that incremental presentation of stimuli of different lengths (and complexity) enhances statistical learning in humans and neural networks32. In a related formulation, the less is more hypothesis33,34 holds that cognitive limitations, such as reduced working memory capacity, help in learning complex patterns and systems. Research on human learners and less is more/starting small is methodologically diverse and has yielded controversial results35,36,37,38. On the one hand, contrary to the predictions of the less is more hypothesis, several studies found that the acquisition of grammar structures in artificial grammar learning tasks is more effective in adults than in children39. On the other hand, simulating reduced working memory capacity in adults seems to facilitate learning in some40, but not in other studies41. A starting big arrangement of stimuli, which begins with the longer strings of the grammar, has also been argued and demonstrated to result in superior performance by allowing larger chunks to be learned first and parsed later42. However, it may also lead to false hypotheses about grammar structure32,43, or prevent generalization of rules40. To summarize, starting small and starting big training types lead to more efficient learning in some cases, but further research is needed to identify the conditions under which they boost learning.

Although effects of input characteristics like modality, presentation, domain, and training type have been examined before, previous research only investigated these effects on statistical learning separately, calling for further studies with direct comparisons. Furthermore, many studies used different statistical learning designs, with differences in patterns, stimuli, and presentation arrangement. Our aims in this study were to examine modality, presentation, domain, and training type effects using Saffran’s31 predictive grammar in order to extend the results of the original study (a) by systematically investigating all combinations of the examined input effects and their interactions (e.g., by also including visual linguistic conditions), and (b) by directly comparing learning performance across conditions. We compare the efficiency of statistical learning in the visual (in both serial and simultaneous presentation types) and auditory modalities and across the linguistic versus nonlinguistic domains. We also test how training type, namely starting small and starting big, influences learning across these conditions, as training effects have not been examined with finite state, category-based grammars. Figure 1 summarizes the design of the experimental conditions in the study.

Figure 1

The design of the study. We systematically investigated the effect of four factors: modality (auditory vs. visual), presentation type (serial vs. simultaneous), domain (linguistic vs. nonlinguistic), and training type (random vs. starting small vs. starting big), yielding 18 conditions altogether. With 20 participants in each group, 360 participants took part in the study.

Our hypotheses were the following:

  1. Based on previous findings, we expected an advantage of learning in the auditory modality over the visual modality with serial presentation. We also hypothesized that when presentation is optimized for each modality, this advantage disappears, and performance in the serial auditory and the simultaneous visual tasks would be at similar levels.

  2. In the present study, we aimed to directly compare the acquisition of statistical patterns in the linguistic and nonlinguistic domains. Based on the results of Saffran31, we expected that the linguistic versus nonlinguistic status of stimuli would not affect learning efficiency.

  3. Training effects, starting small and starting big, have not been examined with finite state, category-based grammars. We hypothesized that starting small would facilitate learning compared to presenting training sequences in a random order, as starting small enables the generation of simple and flexible hypotheses about the rule32,43. In contrast, starting big would mainly facilitate learning of specific item relations, and as a result, we expected it to yield lower learning performance than random training due to less effective hypothesis generation42.

These hypotheses can be translated to the following formulations in our current experimental design, motivating three sets of analyses:

  1. With serial presentation of stimuli, we expected an advantage of learning in the auditory modality over the visual modality. We hypothesized that there would be no domain effect, that is, the linguistic conditions would not differ from the nonlinguistic conditions. We also expected that starting small training would lead to higher, while starting big training would lead to lower performance than presenting training sequences of different lengths in a random order.

  2. In the visual conditions, we expected an advantage of simultaneous over serial presentation. Here, we also expected no domain effect, and an advantage of starting small and a disadvantage of starting big stimuli relative to random training.

  3. With presentation optimized for each modality, we expected performance in the serial auditory and the simultaneous visual tasks to be at similar levels. Here, we also hypothesized no domain effect, and an advantage of starting small and a disadvantage of starting big stimuli relative to random training.

Method

Participants

360 young adults participated in the study. Most of them were university students who were recruited through facultative cognitive psychology courses at the Budapest University of Technology and Economics and received course credit for their participation. The rest of the participants were volunteers recruited via convenience sampling. Inclusion criteria were normal or corrected-to-normal hearing and vision, and Hungarian as a native language. Participants were asked to report any neurological and psychiatric conditions (none were reported in our sample). Mean age was 22.5 years (SD = 3.9, minimum = 18.1, maximum = 55.8); 255 females and 105 males participated in the study. Age information was missing for two participants. All participants were tested with their informed consent, in accordance with the principles set out in the Declaration of Helsinki and the stipulations of the local institutional review board (United Ethical Review Committee for Research in Psychology, ethical approval number: EPKEB-2018/87).

Stimuli

Throughout the conditions, stimuli varied by modality (auditory versus visual) and domain (nonlinguistic versus linguistic). For all conditions, our aim was to design diverse stimulus sets in which individual stimuli are well discriminable from each other. For the auditory nonlinguistic conditions, we divided a frequency range that was conveniently perceivable through our laboratory headphones (220–831 Hz) into 15 equal steps on a logarithmic (musical) scale to obtain 16 tones. As a result, we obtained intervals larger than standard semitones, and almost as large as standard whole tones (220 Hz, 240 Hz, 263 Hz, 287 Hz, 314 Hz, 343 Hz, 374 Hz, 409 Hz, 447 Hz, 488 Hz, 534 Hz, 583 Hz, 637 Hz, 696 Hz, 761 Hz, 831 Hz). Each tone was 470 ms long. For the auditory linguistic conditions, we used Hungarian CVC nonwords compiled from diverse Hungarian phonemes to promote discriminability (bif, dők, dup, gal, hep, kav, lam, lor, mib, neb, péf, rász, rud, szig, tez, sot). Note that some of the nonwords are four characters long, as they include phonemes with a digraph (two-character grapheme) equivalent (‘sz’). Nonwords were recorded from a Hungarian female speaker, and the average length of the syllables was 470 ms. In the visual nonlinguistic conditions, 16 meaningless symbols were used that were rich in detail and easily distinguishable from each other. In the visual linguistic conditions, the same syllables were used as in the auditory linguistic conditions. Syllables were visually presented on a white screen in black font. Individual items in each stimulus type were assigned to categories (as illustrated in Fig. 2), and the rules of the artificial grammar were defined over these categories.
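The listed tone frequencies are consistent with dividing the 220–831 Hz range into 15 equal steps on a logarithmic scale. A minimal sketch of this construction (the formula is our reconstruction from the listed values):

```python
def geometric_scale(f_low, f_high, n_steps):
    """Divide [f_low, f_high] into n_steps equal steps on a log scale,
    returning n_steps + 1 frequencies (here: 16 tones)."""
    ratio = (f_high / f_low) ** (1 / n_steps)
    return [f_low * ratio ** k for k in range(n_steps + 1)]

tones = [round(f) for f in geometric_scale(220, 831, 15)]
# Matches the frequencies listed in the text:
# [220, 240, 263, 287, 314, 343, 374, 409, 447, 488, 534, 583, 637, 696, 761, 831]
```

The constant step ratio is about 1.093, i.e., roughly 1.5 semitones, matching the description of intervals between a semitone and a whole tone.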

Figure 2

Stimulus sets in different conditions by modality and domain. Stimuli in each condition are classified into categories (A, C, D, F, and G). The rules of the grammar are defined on these categories.

With the help of the grammar (given in Fig. 3, taken from Saffran31) and the condition-specific categorized vocabularies, we generated 58 grammatical sentences of three to five items and 32 phrases of two to three items for the learning phases (90 sequences altogether). Phrases were parts of grammatical sentences. We also generated 24 pairs of grammatical and ungrammatical sequences (9 four-item and 15 five-item sequences) for the test phases in all conditions. Grammatical sentences followed the rules of the grammar, while the ungrammatical ones each violated one of the grammatical rules: (1) sentences must contain an AP phrase, (2) D words follow A words, while G words follow C words, (3) sentences must contain an F word, (4) CP phrases must precede F words. As a result, there were four violation types, one for each rule: (1) sentences starting with a BP phrase instead of an AP phrase, (2) sentences where D and G words were interchanged, so that G words followed A words and D words followed C words, (3) sentences where F words were exchanged for G words, (4) sentences where CP phrases or parts of the CP phrases were missing before F words. Each violation type was represented by six ungrammatical strings. Within each category, members were randomly distributed across sentences. The full set of training and test sequences, together with their statistical properties for the linguistic conditions, is included as supplementary material online. Sequences of the nonlinguistic conditions paralleled those of the linguistic conditions, that is, each syllable corresponded to a pure tone and a symbol, respectively. The modality and domain variants of conditions differed only in their stimulus set.

Figure 3

Rules of the artificial grammar (from Saffran31). Letters A, C, D, F and G refer to categories, each of which includes a set of items (tones, syllables, or symbols in the different conditions). Items in each category were randomly distributed in sentences. A sentence consists of an “AP” phrase, a “BP” phrase, and an optional “CP” phrase. An “AP” phrase is made of an “A” category item and an optional “D” category item. A “CP” phrase consists of a “C” and an optional “G” item. A “BP” phrase is made of a “CP” phrase and an “F” item (cf. rule 3 above: every sentence contains an F word).
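The category-level sentence patterns the grammar licenses can be enumerated mechanically. A minimal sketch, assuming the phrase structure of Fig. 3 with the F word treated as obligatory, as required by rule (3) listed above:

```python
from itertools import product

# Phrase expansions at the category level (per Fig. 3):
AP = ["A", "AD"]              # A word plus optional D word
CP = ["C", "CG"]              # C word plus optional G word
BP = [cp + "F" for cp in CP]  # CP phrase followed by an F word (rule 3)
OPTIONAL_CP = [""] + CP       # sentence-final CP phrase is optional

sentences = {ap + bp + cp for ap, bp, cp in product(AP, BP, OPTIONAL_CP)}
# 12 distinct category patterns; the shortest is "ACF"
```

The enumeration yields patterns of three to seven category slots; the training and test sets described above used sequences in the three-to-five-item range.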

Procedure

Participants were tested in a silent room in groups of two or three. The testing was administered using E-Prime 2.0 Professional. The test administration took approximately 15 min and consisted of a training phase and a test phase in all conditions.

In the auditory conditions, items were presented with no pauses between them. In the visual conditions, we applied two presentation types: in the serial conditions, one item was presented at a time in the center of the screen for 800 ms, followed by the next item with no pauses, while in the simultaneous conditions, all items of a sentence were presented together on the screen at the same time. (Pilot data from our lab on a simpler segmentation task showed no learning effect in the case of visual statistical learning when stimulus timing was matched to that of acoustic statistical learning and set to 470 ms. This was one of the reasons for using a longer presentation time: we wanted to avoid floor effects in a more complex task in the visual modality. Choosing longer presentation times was also motivated by earlier studies showing that longer presentation times in visual statistical learning indeed promote learning69,76,77. Since visual presentation rates vary between 400 and 1200 ms in the literature, we chose a mid-range 800 ms, substantially longer than the timing used in our pilot studies.) Presentation time was adjusted to sequence length (the number of items times 800 ms). During the training phase, participants were instructed to simply attend to the presented sequences.

In all combinations of modality, presentation and domain, we examined the effects of three different training types. All conditions presented the same set of sequences; small and big were not defined in absolute terms but refer to the relative length of sentences within the same training set. In the random conditions, sequences of different lengths were presented in a random order; the starting small conditions involved incremental presentation of sentences ordered by length, starting with the shortest sequences; the starting big conditions were the reverse of the starting small conditions, starting with the longest sequences and gradually proceeding towards the shortest ones. It is important to point out that the shortest strings were not full sentences of the grammar, but structural units (phrases) of the language.
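The three training types amount to different orderings of the same sequence set; a minimal sketch (the demo strings are hypothetical category patterns, not the actual stimuli):

```python
import random

def training_order(sequences, training_type, seed=0):
    """Order the same training set by condition: random shuffle,
    starting small (shortest first), or starting big (longest first)."""
    if training_type == "random":
        ordered = sequences[:]
        random.Random(seed).shuffle(ordered)
        return ordered
    if training_type == "starting_small":
        return sorted(sequences, key=len)
    if training_type == "starting_big":
        return sorted(sequences, key=len, reverse=True)
    raise ValueError(f"unknown training type: {training_type}")

demo = ["ACF", "ADCGF", "AC", "ACGF"]  # hypothetical category strings
small_first = training_order(demo, "starting_small")  # shortest sequence first
```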

In the two-alternative forced choice (2AFC) test phase, participants were told that the sequences presented before were in an unknown language, and were then presented, in each of 24 trials, with a sequence pair consisting of a grammatical sentence and a sentence containing a violation. The grammatical-ungrammatical order within each sequence pair was counterbalanced across trials. The order of the trials was random, but the sentence pairs were preset. Participants were instructed to choose the sentence more similar to the sentences of the unknown language in the training phase and to indicate it by pressing the corresponding key (‘1’ for the first sentence and ‘2’ for the second). The two sentences followed each other with 2000 ms pauses. Higher-than-chance scores (choosing the grammatical member of the pair significantly more than 50% of the time) were taken as evidence of learning.
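Whether an individual 2AFC score exceeds the 50% chance level over the 24 test trials can be checked with an exact binomial test; a minimal standard-library sketch (the 18-of-24 score is a hypothetical example, not a result from the study):

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(pr for pr in probs if pr <= probs[k] * (1 + 1e-9))

# Hypothetical participant: 18 grammatical choices out of 24 test trials
p_value = binomial_two_sided_p(18, 24)  # ≈ 0.023, i.e., above chance
```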

Results

Data were analyzed and visualized using IBM SPSS Statistics 20, JASP version 0.15.0.078 and the R package ggplot2, version 3.3.544. Descriptive statistics of accuracies in the 2AFC task are displayed in Table 1.

Table 1 Descriptive statistics of groups in different modality, presentation, domain and training conditions.

Post-hoc calculations of power estimates are included in Appendix 1. Detailed analyses of performance as a function of statistical regularities and violation types, as well as of trial-level performance, were also conducted. These analyses showed that (1) participants performed above chance level in all violation types, (2) there were significant differences between performance by different violation types, (3) higher accuracy in violation types tended to co-occur with higher discriminability between the grammatical versus ungrammatical sequence based on statistical features, and (4) statistical feature differences between grammatical and ungrammatical test items influenced accuracy on the word level, but not on the category level. These additional analyses are included in the supplementary materials.

First, we analyzed results for all conditions with serial presentation of stimuli. We conducted a three-way ANOVA to test the effects of Modality, Domain and Training Type. The effect of Modality was significant, F(1,228) = 19.12, p < 0.001, ηp2 = 0.08, BF10 = 5.557e+6; performance in the auditory modality was higher than in the visual modality. The effect of Domain was also significant, F(1,228) = 11.83, p = 0.001, ηp2 = 0.05, BF10 = 186,759.88, showing that participants performed better in conditions in the linguistic than in the nonlinguistic domain. The effect of Training Type was not significant, F(2,228) = 1.51, p = 0.223, ηp2 = 0.01, BF10 = 0.914. The interaction of Modality*Domain was significant, F(1,228) = 26.83, p < 0.001, ηp2 = 0.11, BF10 = 54,417.28 (Fig. 4). Post hoc analyses showed that in the auditory modality, performance in the linguistic domain was significantly higher than in the nonlinguistic domain, t(118) = -6.95, p < 0.001, r = 0.54, BF10 = 3.761e+7. In the visual modality, the difference between the two domains was not significant, t(118) = 1.08, p = 0.284, r = 0.10, BF10 = 0.33. In the nonlinguistic domain, Modality did not affect performance (t(118) = 0.57, p = 0.570, r = 0.05, BF10 = 0.23), while in the linguistic domain, the efficiency of auditory and visual learning differed significantly, with higher scores in the auditory linguistic than in the visual linguistic condition, t(118) = 6.52, p < 0.001, r = 0.51, BF10 = 4.910e+6.
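The reported effect sizes can be reproduced from the test statistics and degrees of freedom alone; a short sketch of the standard conversions, using the Modality F value and the auditory domain contrast t value reported above:

```python
from math import sqrt

def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return F * df_effect / (F * df_effect + df_error)

def r_from_t(t, df):
    """Effect size r for a t test."""
    return sqrt(t**2 / (t**2 + df))

# Values reported above: F(1,228) = 19.12 for Modality, and t(118) = 6.95
# for the auditory linguistic vs. nonlinguistic contrast
eta_modality = round(partial_eta_squared(19.12, 1, 228), 2)  # 0.08
r_auditory_domain = round(r_from_t(6.95, 118), 2)            # 0.54
```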

Figure 4

2AFC accuracy by Modality and Domain with serial presentation. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data more than 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance in the auditory linguistic condition was significantly better than performance in the auditory nonlinguistic condition and performance in the visual linguistic condition (regardless of Training Type).

The interaction of Modality*Training Type was also significant, F(2,228) = 6.06, p = 0.003, ηp2 = 0.05, BF10 = 4.59 (Fig. 5). Post hoc tests showed that in the auditory modality, Training Type had no significant effect, F(2,117) = 0.99, p = 0.374, ηp2 = 0.02, BF10 = 0.18. However, in the visual modality, the effect of Training Type was significant, F(2,117) = 5.32, p = 0.006, ηp2 = 0.08, BF10 = 6.07. Tukey pairwise comparisons showed that only the starting small and starting big groups were different (with higher performance in the starting big than in the starting small condition), p = 0.005, BF10 = 13.47; the random and starting small, and the random and starting big groups did not differ from each other, p = 0.110, BF10 = 1.96 and p = 0.459, BF10 = 0.41, respectively. In the case of the latter two comparisons, Bayes factors did not show evidence for equal performance in different training types. Further analyses showed that learning in the auditory modality was more efficient than in the visual modality in the case of the starting small condition, t(78) = 5.06, p < 0.001, r = 0.50, BF10 = 5497.80. The two modalities did not differ in the random and the starting big groups, t(78) = 1.74, p = 0.085, r = 0.19, BF10 = 0.86, t(78) = 0.50, p = 0.621, r = 0.06, BF10 = 0.26. However, in the former comparison, the Bayesian analysis did not show evidence for equal performance in the two modalities. No other interactions were significant.

Figure 5

2AFC accuracy in the Modality*Training Type interaction in the case of serial presentation. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data more than 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance was significantly lower in the visual starting small condition than performance in the auditory starting small condition and performance in the visual starting big condition (regardless of Domain).

We performed a second three-way ANOVA to test the effects of Presentation Type, Domain and Training Type on accuracy in the visual conditions. The effect of Presentation Type was significant, F(1,228) = 20.38, p < 0.001, ηp2 = 0.08, BF10 = 1092.94; performance was higher with simultaneous presentation than with serial presentation. The main effects of Domain and Training Type were not significant, F(1,228) = 2.11, p = 0.148, ηp2 = 0.01, BF10 = 0.18, and F(2,228) = 0.10, p = 0.371, ηp2 = 0.01, BF10 = 0.56, respectively. The interaction of Presentation Type*Training Type was significant, F(2,228) = 6.10, p = 0.003, ηp2 = 0.05, BF10 = 2.86 (Fig. 6). Post hoc analyses showed that in the case of serial presentation, the effect of Training Type was significant, F(2,117) = 5.32, p = 0.006, ηp2 = 0.08, BF10 = 6.07; Tukey pairwise comparisons showed that starting small was less efficient than starting big training, p = 0.005, BF10 = 13.47, but learning with random and starting small, and random and starting big training did not differ from each other, p = 0.110, BF10 = 1.96 and p = 0.459, BF10 = 0.41, respectively. (Note that this analysis is the same as the post hoc analysis of Training Type in the case of the visual modality in the previous ANOVA.) The effect of Training Type was not significant in the case of simultaneous presentation, F(2,117) = 1.75, p = 0.178, ηp2 = 0.03, BF10 = 0.33. When analyzing the effect of Presentation Type in each Training Type, we found that in the case of random and starting small training, simultaneous presentation resulted in higher performance levels than serial presentation, t(78) = -2.77, p = 0.007, r = 0.30, BF10 = 5.99, and t(78) = -5.58, p < 0.001, r = 0.53, BF10 = 37,631.56. In the case of starting big training, there was no difference between presentation types, t(78) = -0.06, p = 0.951, r = 0.01, BF10 = 0.23. No other interactions were significant.

Figure 6

2AFC accuracy in the Presentation*Training Type interaction in the case of visual stimuli. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data more than 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Serial starting small performance was lower than serial starting big performance and simultaneous starting small performance; and simultaneous random performance was higher than serial random performance (regardless of Domain).

We performed a third three-way ANOVA to test the effects of Modality, Domain, and Training Type for optimal presentation for each modality, i.e., serial presentation for auditory stimuli, and simultaneous presentation for visual stimuli. With presentation type fitted to modality, the effect of Modality was not significant, F(1,228) = 0.34, p = 0.560, ηp2 < 0.01, BF10 = 1335.81. On the other hand, the effect of Domain was significant, F(1,228) = 13.34, p < 0.001, ηp2 = 0.06, BF10 = 46,289.46, showing that participants performed better in conditions in the linguistic domain than in the nonlinguistic domain. The effect of Training Type was not significant, F(2,228) = 2.30, p = 0.102, ηp2 = 0.02, BF10 = 0.17. The interaction of Modality*Domain was significant, F(1,228) = 26.24, p < 0.001, ηp2 = 0.10, BF10 = 6922.99. This interaction is illustrated in Fig. 7. Post hoc analyses showed that in the auditory modality, performance in the nonlinguistic domain was significantly lower than in the linguistic domain, t(118) = 6.95, p < 0.001, r = 0.54, BF10 = 3.761e+7. In the visual modality, the difference between the two domains was not significant, t(118) = 0.95, p = 0.346, r = 0.09, BF10 = 0.291. In the nonlinguistic domain, the performance in the visual modality was higher than in the auditory modality, t(118) = 4.03, p < 0.001, r = 0.35, BF10 = 216.78, and in the linguistic domain, we observed the opposite pattern, with significantly higher performance in the auditory than in the visual modality, t(118) = 3.21, p = 0.002, r = 0.28, BF10 = 17.97. No other interactions were significant.

Figure 7

2AFC accuracy by Modality and Domain with the optimal Presentation for each modality (serial presentation for auditory stimuli, and simultaneous presentation for visual stimuli). Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data more than 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance in the auditory linguistic condition was higher than performance in the auditory nonlinguistic condition and performance in the visual linguistic condition; and performance in the visual nonlinguistic condition was superior to performance in the auditory nonlinguistic condition (regardless of Training Type).

As pointed out by one of the reviewers of the original manuscript, in the auditory nonlinguistic conditions, the use of musical tones may give rise to musical features like contours (the ascending and descending pattern between tones) and intervals (the relative pitch change between tones)75. Taking this into account, a possible explanation for lower performance in the auditory nonlinguistic condition is that statistical patterns of these emergent musical features might conflict with statistical information imposed by the grammar alone. To check this possibility, we compared the main effect of Domain and the Modality*Domain interaction in test trials where grammatical and musical patterns converged (based on the statistics of the learning phase) and in test trials where grammatical and musical statistics diverged in supporting a choice between the grammatical versus ungrammatical item. We found that the main effect of Domain and the Modality*Domain interaction were more prominent in test trials where emergent musical patterns and grammatical patterns did not converge. A thorough description of the recoding process and the analysis is provided in Appendix 2.
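Contour and interval sequences of this kind can be derived directly from the tone frequencies; a minimal sketch of the underlying idea (the exact recoding procedure is the one described in Appendix 2, not this sketch):

```python
from math import log2

def musical_features(freqs):
    """Interval (signed semitone distance) and contour (up/down/flat)
    between consecutive tones of a sequence."""
    intervals = [12 * log2(b / a) for a, b in zip(freqs, freqs[1:])]
    contour = ["up" if i > 0 else "down" if i < 0 else "flat" for i in intervals]
    return intervals, contour

# Example with three tones from the stimulus set plus a return to 240 Hz
intervals, contour = musical_features([220, 240, 263, 240])
# contour == ["up", "up", "down"]
```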

Since age was not entirely balanced between groups, we checked whether it affected learning to control for potential biases. Age had a very weak but significant negative relationship with performance, r(356) = − 0.11, p = 0.039; however, the Bayesian analysis did not show evidence for this relationship, BF10 = 0.64. To examine a potential effect on the results of the ANOVA analyses, we performed three further ANCOVAs with Age as a covariate. When testing the effects of Modality, Domain and Training Type in the case of serial presentation, we found that Age had no significant effect on performance, F(1,226) = 0.07, p = 0.790, ηp2 < 0.01, BF10 = 0.17. Similarly, when testing the effects of Presentation Type, Domain and Training Type in the case of visual stimuli, Age had no significant effect, F(1,226) = 0.17, p = 0.677, ηp2 = 0.01, BF10 = 0.16. When testing the effects of Modality, Domain, and Training Type in the case of optimal presentation type, Age was a significant covariate, F(1,225) = 6.02, p = 0.015, ηp2 = 0.03, BF10 = 8.01, but including it as a covariate did not change the pattern of findings: the effects of Domain and Modality*Domain remained significant.

Summary and discussion

Statistical learning operates across many different areas of cognition on different stimuli, but the effects of modality, presentation, domain, and training type, together with their interactions, have not been examined jointly and systematically in the statistical learning literature. To fill this gap, we investigated the effects of these factors in an artificial grammar task. When stimuli were presented serially, learning was more effective in the auditory than in the visual modality. This modality effect was particularly pronounced in the linguistic domain. With simultaneous presentation of visual stimuli, the auditory advantage over the visual modality disappeared. A significant domain effect showed that learning linguistic patterns results in higher performance than learning nonlinguistic patterns. However, the linguistic advantage over the nonlinguistic material was only present in the auditory modality. The auditory linguistic condition had an overall advantage over the other modality-domain combinations. Training type had no general effect on the acquisition of the grammar, but starting big enhanced performance relative to starting small training in the case of serial visual presentation, and starting small training with serial visual materials resulted in lower performance than starting small training with simultaneous visual materials. The results and their implications are discussed in more detail in the following sections, in the context of earlier findings.

Effects of modality and presentation

We expected an auditory advantage in statistical learning with serial presentation of stimuli. This assumption was supported by our results: the grammar was easier to learn in the auditory than in the visual modality, in line with previous results by Conway and Christiansen3, who found the same pattern with a simpler grammar. These observations suggest that regardless of the complexity and structure of the pattern to be learned, when stimuli are presented serially, learning is more effective in the auditory than in the visual modality. Such modality effects in statistical learning tasks might reflect differences in general information processing mechanisms across sensory modalities. Supporting this notion, Conway and Christiansen30 demonstrated that the well-known primacy and recency effects in serial recall45,46 are also present in statistical learning. Moreover, the advantage of auditory over visual presentation, demonstrated in previous studies and in the current one, has also been described outside the field of statistical learning, for instance, in memory for the order of words in a word list recall task47.

However, with presentation type optimized, i.e., when items of the sequence are presented serially in the auditory and simultaneously in the visual modality, the auditory advantage disappeared and learning was equally efficient in both modalities, in concert with previous results of Conway and Christiansen30. The findings of Saffran31 also provide indirect support for this claim; however, she only found an advantage of simultaneous over serial presentation for visual stimuli with the same predictive grammar we also used (but not with the nonpredictive grammar). As she discusses, it is unclear whether this pattern was due to an advantage of visual simultaneous learning for the predictive grammar or a disadvantage for the non-predictive grammar. Taken together, (1) our results support the advantage of auditory over visual statistical learning with serial presentation; (2) simultaneous presentation seems to benefit visual statistical learning of sequences over visual serial presentation; and (3) when presentation is optimized for modality, there is no difference between modalities in learning efficiency.

The advantage of simultaneous compared to serial visual presentation raises the possibility that modality effects might be specific to, or at least interact with, structure type. In statistical learning, modality effects are generally investigated with sequential structures (with some exceptions48). However, while auditory perception and processing seem to be suited for temporal information, which is inherently sequential, vision is better suited to processing spatial than temporal information, and spatial information can be both sequential and nonsequential (as concluded by Freides49, and Conway and Christiansen3, but see also other studies48,50,51,52). Testing modality effects can be challenging in the case of nonsequential structures, although not impossible48, due to the sequential organization of most types of auditory information. As this modality effect might be limited to sequential processing, further studies should target nonsequential structures to broaden our knowledge about modality effects in statistical learning and other domains of cognition. To conclude, the present study (1) confirms modality effects observed in earlier studies and extends them to predictive dependencies and a category-based grammar, and (2) shows that these modality effects can be structure dependent.

Domain effects

Based on previous findings31, we expected no advantage of learning the grammar with linguistic over nonlinguistic stimuli (although see70 for results with a linguistic advantage in a different design). This assumption was only partially supported by our findings. In the case of serial presentation, performance was higher in linguistic than in nonlinguistic conditions. We observed a similar domain effect in the analysis including serial auditory and simultaneous visual learning (i.e., the optimal presentation for each modality). A possible explanation for a linguistic advantage is that the grammar was explicitly created to mimic predictive dependencies and word categories common in human languages. In the original design, Saffran31 argued that learning constraints should be tailored to the stimuli for effective learning; thus, different constraints might be advantageous for learning linguistic and nonlinguistic stimuli, as different chunking and grouping mechanisms might operate in these domains (e.g., different constraints for linguistic versus musical structures in the auditory modality, and for symbol sequences versus complex real-life visual scenes in the visual modality). This type of structure with predictive dependencies and word categories, characteristic of language, might be optimal for learning linguistic materials. A further potential explanation of the linguistic over nonlinguistic advantage is that participants, although not instructed to do so, might also apply explicit memorization strategies to linguistic materials (e.g., rehearsal of sequences) which are less available for other types of stimuli.

However, the presence of the domain effect in the auditory conditions draws attention to the potential influence of stimulus-specific factors beyond general effects in statistical learning. In the auditory nonlinguistic condition, the use of musical tones may give rise to musical features like contours (the ascending and descending pattern between tones) and intervals (the relative pitch change between tones)75, which might support or conflict with grammatical information. Indeed, the linguistic advantage observed in the auditory modality was challenged by further analyses suggesting that lower performance in the nonlinguistic condition might have been caused by conflicting grammatical and musical patterns. Therefore, also in line with the absence of a linguistic advantage in the visual modality, our results do not support general domain effects in statistical learning: the efficiency of learning may depend on more stimulus-specific features. Stimulus- and task-specific learning effects are not surprising, since statistical information is not the only cue to finding structure in environmental stimuli. In cases of conflicting cues, other sources of information may override statistics (see, e.g., prosody over statistics: Johnson and Seidl73; familiar units over statistical cues: Poulin and colleagues74), although in other cases, learners may rely on statistical features over other information types (statistical cues over similarity: Tillmann and McAdams79).

To summarize, we found an advantage of statistical learning in the auditory linguistic condition compared to all other conditions, including visual linguistic learning. In addition, performance in the auditory nonlinguistic condition was weaker than in the other conditions. These results show that the effectiveness of statistical learning may be influenced by the domain of learning (e.g., linguistic versus nonlinguistic). However, in our study this domain effect was confounded with other emergent patterns in the stimuli: musical patterns (contours and intervals) in tone sequences were in conflict with statistical patterns defined by the grammar, making learning in the auditory nonlinguistic condition more difficult than learning sequences of syllables. Further studies are needed to clarify the nature of such effects, control for them, and examine their interaction with domain and modality. These results suggest that instead of global domain effects, stimulus-specific effects shape statistical learning, which may also depend on task type, design, and features of the learning material.

Training effects

To examine the influence of input characteristics on statistical learning, we also explored training effects across different modalities and domains. We hypothesized that starting small would facilitate the acquisition of the category-based grammar by enabling the generation of simple and flexible hypotheses about the underlying rules. In contrast, we expected starting big to yield lower learning performance due to less effective hypothesis generation. However, we only found an effect of training with serial presentation in the visual modality: here, regardless of stimulus domain (i.e., both in the linguistic and nonlinguistic conditions), starting small training had an adverse effect on performance, while starting big training facilitated learning. This pattern of results for different training types suggests that the manner of stimulus presentation can affect statistical learning in important and perhaps modality- and domain-dependent ways. The visual processing system seems to be optimized for spatial rather than temporal processing49, and starting big presentation might compensate for the insufficient availability of information in the serial presentation.

The above pattern of results contrasts with earlier findings about the starting small effect in visual statistical learning, which showed enhanced acquisition of structure with starting small training and simultaneous presentation in the visual modality both in the linguistic53 and the nonlinguistic domain35. These contradictory findings may be explained by differences in the grammars: previous studies applied recursive grammars in which the structure was based on the non-adjacent combination of item pairs. Thus, the initial acquisition of adjacent pairs of these legal combinations is essential, and increasingly more difficult when embedded in longer sequences: the complexity (the number of different sequences the grammar can generate) of recursive grammar sentences increases exponentially as a function of length53. Starting small training targets this problem by presenting only the two adjacent items of a pair at the beginning. However, the grammar that we used is different in structure: here, complexity does not increase with sentence length as much as in the case of recursive grammars. Poletiek and colleagues53 argued that the key to the starting small effect is the presentation of less complex, and not necessarily shorter, sequences. As a result, the acquisition of this type of grammar may not profit as much from starting small training. However, the statistical properties of ‘small’ phrases were not controlled for, and post-hoc analyses of these regularities do not show systematic differences. Shorter sequences with less complex statistical regularities than the longer ones might yield larger starting small benefits: this would be a design worth implementing in a future study.

A further reason for the absence of the starting small effect might be that shorter sequences induce explicit rule search strategies, which decrease the efficiency of learning complex statistical patterns71,72. It is also possible that, in the starting small presentation conditions of our study, we did not provide sufficient information at the beginning of training for beneficial effects to emerge. Given the variability of items within phrases, the training might have been too short for participants to acquire these basic units; a longer training with ‘small’ phrases might have resulted in stronger or more explicit representations, which might have then served better as building blocks in later parts of the training with more complex material. To summarize, as training effects might significantly depend on grammar or structure type, further studies are needed to determine their scope. A larger sample size would also benefit further exploration of training effects, as post-hoc comparisons in the Modality*Training Type interaction were not sufficiently powered to unequivocally show either the presence or the absence of a difference.

Considerations about pattern and stimuli characteristics

Statistical learning is an umbrella term covering the acquisition of several types of patterns and systems, for instance, segmenting words from a speech stream9,11,70, learning regularities in real-world scenes27 or spatial locations13,14, acquiring visual patterns and faces6,7,28,54, or learning musical systems87,88. Even in the case of learning sequential information in artificial grammar learning tasks, the structure to be acquired is highly variable: phrasal31, finite-state55, center-embedded or right-branching recursive35,38, and non-adjacent dependency grammars56 are all applied. The literature on the effects of modality3, presentation30,31, domain31,70 and training25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,41,42, and more broadly, input properties e.g.,79,83,84,85,86, also relies on results from statistical learning studies working with a large variety of structure types. Although we used a category-based artificial grammar consisting of predictive dependencies in the present study, we aimed to explore domain, modality and training effects on statistical learning in general. Our results extend and confirm previous findings from different tasks and stimulus sets on modality, domain, presentation and training effects in statistical learning. At the same time, contradicting findings from tasks with different statistical structure types (e.g., while Saffran31 found no linguistic advantage in an artificial grammar task, Hoch, Tyler and Tillmann70 found that learning linguistic materials was more successful than learning nonlinguistic materials in a segmentation paradigm) draw attention to a possible interaction of input characteristics and structure type, which should be addressed by future studies.

Besides structure type, stimulus type is also an underexamined yet significant factor in statistical learning57. Linguistic stimuli can take many forms in different modalities, and different constraints may apply to learning from speech streams versus written texts versus gesture sequences. In the nonlinguistic domain, on the other hand, various types of musical and environmental sounds can be used as auditory stimuli, while for the visual modality, applied stimuli range from colorful squares through spatial locations to complex symbols, all organized by potentially different statistical constraints. The constraints for optimal acquisition might be specific not only to modality and/or domain, but to stimulus type as well. Previous results also suggest that learning efficiency for different stimulus types interacts with age: Raviv and Arnon67 and Shufaniya and Arnon68 found different developmental trajectories during childhood for different stimulus types in statistical learning. Further studies should explore such specificity in statistical learning: investigating modality, domain and training effects with a diverse set of structure and stimulus types at different ages is an important future direction.

Methodological and psychometric limitations

One of the limitations of our study is a general methodological problem that many statistical learning studies face: we only measured learning offline, that is, after the learning phase. This post hoc measurement is problematic in multiple respects. First, this way, we cannot gain information about the process and dynamics of learning. Second, as a consequence, we measure knowledge only at retrieval, which is a different process from encoding (for a discussion of implications for statistical learning, see58). This is especially important when modality- and domain-specific effects are in focus, as their encoding and retrieval processes might differ59,60. Third, the typically applied offline forced-choice tests recruit cognitive abilities distinct from statistical learning, for instance, decision-making and working memory processes61,62. Individual variation in these abilities might also make the measurement of statistical learning noisy and unreliable. A potential solution to these pitfalls is relying on online measurements: for instance, measuring reaction times to one or more predictable items during training makes it possible to infer changes in the efficiency of processing and predicting items in the pattern. This can then be applied as a measure of statistical learning13,14,62,63,64,65.

There are also psychometric aspects to be considered in future testing. Offline forced-choice tasks often apply a relatively low number of trials. However, in a task type where group performance is mostly only slightly above the chance level, individual above-chance performance is difficult to distinguish from chance-level performance66. In our case, with a mean score of 0.62 across the 24 trials of the two-alternative forced-choice task, there is an 8% chance, based on the binomial distribution, that an individual performed above chance merely by accident. This is even more likely in conditions where mean performance was lower. (However, increasing the number of test trials, and thus participants’ exposure to ungrammatical sequences, may weaken or alter the acquired statistical representations. This effect could be minimized by including ungrammatical trials without any systematic statistical biases, or controlled for by applying statistical methods that include trial order as a random factor.) Including trials with systematically varying difficulty would also yield a better-targeted method, as participants with different levels of knowledge could be tested more accurately. Thus, increasing the number and variability of trials would make results less noisy and more reliable, resulting in a better statistical learning task.
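The chance-level figure above can be reproduced with a short, stdlib-only binomial sketch. One detail is an assumption on our part: exactly how the 8% was rounded is not stated, and the value matches the probability of a pure guesser scoring strictly above 15 of the 24 trials, while reaching the observed mean (15 or more correct, since 0.62 × 24 ≈ 14.9) would occur by chance about 15% of the time:

```python
from math import comb

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct answers."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 24  # test trials in the two-alternative forced-choice task (chance p = 0.5)

# Probability that a pure guesser reaches the observed mean (>= 15 of 24):
p_ge_15 = binom_sf(15, n)   # ~ 0.154
# Probability that a pure guesser strictly exceeds 15 of 24 correct:
p_ge_16 = binom_sf(16, n)   # ~ 0.076, i.e., the ~8% figure

print(f"P(X >= 15) = {p_ge_15:.3f}, P(X >= 16) = {p_ge_16:.3f}")
```

The same survival function also gives a principled individual-level cutoff: the smallest score whose chance probability falls below .05 is 17 of 24, since P(X ≥ 17) ≈ .032 while P(X ≥ 16) ≈ .076.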

Finally, it is also a limitation to be addressed by future studies that we did not collect any information on participants’ musical training background. In the serial nonlinguistic condition, tone sequences created short melodies, which participants with musical training might have found easier to process. Since more general beneficial effects of musical training have been reported for memory and learning80,81,82, controlling for effects of musical training on performance would be relevant not just for the statistical learning of tone sequences, but for other modalities and domains as well.

Conclusions

The present study demonstrates that the efficiency of acquiring statistical structures may differ considerably depending on the specific modality, domain, and presentation type. Most importantly, our findings show an advantage of sequential learning of auditory linguistic stimuli over other modalities and domains. Moreover, when grammar-based and musical features were matched in the nonlinguistic auditory condition, similar levels of performance were reached as in the linguistic auditory condition. This indicates the presence of constraints in statistical learning: serial presentation with this type of sequential structure with predictive dependencies and abstract categories might be optimal for learning auditory stimuli, while other stimulus types might profit more from other structure varieties. Our results also suggest that an optimal presentation type, that is, simultaneous presentation, can boost learning performance in the visual modality. However, we found no general training effect in the present study, which indicates that training effects might also depend on the specific structure to be acquired. Our findings show that learning is constrained by modality and presentation type together with the specific stimulus characteristics of the input, and they call for broadening the scope of research by testing input effects on statistical learning with a wider range of structure and stimulus types.