Abstract
Speech perception consistency refers to the similarity between a listener’s responses to repeated presentations of the same speech sound (e.g., along a /ba/–/pa/ continuum). Although research has demonstrated multiple advantages of consistency, its role in initial lexical activation (i.e., early activation of word candidates) and speech perception flexibility (e.g., the ability to recover from misleading speech input) has not been tested. We investigated the role of consistency in spoken-word recognition among Spanish (L1)–English (L2) bilingual listeners. Focusing on the bilabial stop contrast (/b/–/p/), consistency was measured in a task where participants repeatedly rated speech sounds on a continuous scale (Visual Analog Scale), whereas initial lexical activation and speech perception flexibility were assessed using a word-to-picture matching task with eye-tracking (Visual World Paradigm). Seventy Spanish–English bilinguals completed these tasks in both languages. Listeners with higher consistency exhibited stronger initial lexical activation for acoustically compatible word candidates across /b/ and /p/ in L1 Spanish and for /p/ in L2 English. However, consistency was not associated with speech perception flexibility. These findings suggest that speech perception consistency primarily supports early lexical activation during spoken-word recognition, while playing a more limited role in later processes involved in revising initial misinterpretations.
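As an illustration only, the notion of consistency described above (similarity among a listener's responses to repeated presentations of the same sound) can be sketched as the average spread of Visual Analog Scale ratings within each continuum step. This is a minimal sketch under stated assumptions, not the scoring procedure used in the study: the function name `response_variability`, the 0–100 rating scale, the `(step, rating)` data layout, and the choice of within-step standard deviation as the index are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def response_variability(trials):
    """Mean within-step standard deviation of VAS ratings for one listener.

    `trials` is an iterable of (continuum_step, rating) pairs; ratings are
    assumed to lie on a 0-100 scale. Lower values mean more consistent
    responding. This index is an illustrative assumption, not the study's.
    """
    by_step = defaultdict(list)
    for step, rating in trials:
        by_step[step].append(rating)

    # Only steps presented more than once say anything about consistency.
    spreads = [pstdev(ratings) for ratings in by_step.values() if len(ratings) > 1]
    if not spreads:
        raise ValueError("need at least one step with repeated presentations")
    return mean(spreads)


# Hypothetical listeners rating two steps of a /ba/-/pa/ continuum three times each.
consistent_listener = [(1, 10), (1, 12), (1, 11), (7, 90), (7, 88), (7, 89)]
variable_listener = [(1, 10), (1, 40), (1, 25), (7, 90), (7, 60), (7, 75)]

print(response_variability(consistent_listener))
print(response_variability(variable_listener))
```

Under this sketch the first hypothetical listener yields a smaller value than the second, which corresponds to higher consistency. Published work often indexes consistency differently (for example, from residuals around a fitted response function), so the within-step spread here should be read as one simple option among several.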
Data availability
The datasets generated during and/or analyzed during the current study are available in the OSF repository, [https://doi.org/10.17605/OSF.IO/QXMYK](https://doi.org/10.17605/OSF.IO/QXMYK).
Acknowledgements
We would like to acknowledge Amets Esnal and Itziar Basterra for organizing the data collection, Ainhoa Eguiguren for the translations of the task instructions, and Daphne Weiss and Elena Alguirrebengoa for recording the stimuli for the present study. We would also like to thank Hyoju Kim and an anonymous reviewer for their constructive comments on a previous version of the manuscript.
Funding
This research was supported by the Basque Government through the BERC 2022–2025 program and funded by the Spanish State Research Agency through the BCBL Severo Ochoa excellence accreditation CEX2020-001010/AEI/10.13039/501100011033. It was also supported by a predoctoral fellowship (grant PRE2021-097223, associated with project CEX2020-001010-S–21–2) funded by the Spanish Ministry of Science and Innovation (MCIN), the State Research Agency (AEI), and the European Social Fund Plus (FSE+), awarded to Brian W. L. Wong. Additional support was provided by the Spanish Ministry of Science and Innovation through grants PID2020-113348GB-I00 and PID2023-146423NB-I00, awarded to Arthur Samuel and Efthymia Kapnoula, and by the Ramón y Cajal Program, funded by MCIN/AEI/10.13039/501100011033 and by the ESF+, under grant RYC2022-035505-I, awarded to Efthymia Kapnoula.
Author information
Contributions
Conceptualization: Brian W. L. Wong (B. W. L. W.), Arthur G. Samuel (A. G. S.), Efthymia C. Kapnoula (E. C. K.); Methodology: B. W. L. W., A. G. S., E. C. K.; Investigation: B. W. L. W.; Software: B. W. L. W., E. C. K.; Formal analysis: B. W. L. W., E. C. K.; Writing - original draft preparation: B. W. L. W.; Writing - review and editing: B. W. L. W., A. G. S., E. C. K.; Supervision: E. C. K., A. G. S.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wong, B.W.L., Samuel, A.G. & Kapnoula, E.C. Speech perception consistency facilitates initial lexical activation, but not speech perception flexibility. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47943-3