Abstract
Auditory perception requires categorizing sound sequences, such as speech or music, into classes, such as syllables or notes. Auditory categorization depends not only on the acoustic waveform, but also on variability and uncertainty in how the listener perceives the sound – including sensory and stimulus uncertainty, the listener’s estimated relevance of the particular sound to the task, and their ability to learn the past statistics of the acoustic environment. Whereas these factors have been studied in isolation, whether and how they interact to shape categorization remains unknown. Here, we measured human participants’ performance on a multi-tone categorization task and modeled each participant’s behavior using a Bayesian framework. Task-relevant tones contributed more to category choice than task-irrelevant tones, confirming that participants combined information about sensory features with task relevance. Conversely, participants’ poor estimates of task-relevant tones or high sensory uncertainty adversely impacted category choice. Learning the statistics of the sound categories over both short and long timescales also affected decisions, biasing them toward the overrepresented category. The magnitude of this effect correlated inversely with participants’ relevance estimates. Our results demonstrate that individual participants idiosyncratically weigh sensory uncertainty, task relevance, and statistics over both short and long timescales, providing a novel understanding of, and a computational framework for, how sensory decisions are made under several simultaneous behavioral demands.
Introduction
Making sensory decisions in everyday settings is a complex task due to the presence of multiple forms of uncertainty1,2,3,4,5,6,7,8. One of many such decisions is identifying what a friend said when the conversation takes place in a crowded restaurant versus in a quiet room. Despite the apparent ease with which we can accomplish this task, it is a complicated computational process involving many factors. First, the sensory transduction process and neural-processing stages are noisy, generating a form of uncertainty called sensory uncertainty9,10,11,12. A second form of uncertainty is stimulus relevance: of all stimuli within a sensory scene, only some are relevant to the current decision13,14,15,16,17,18. For example, when trying to identify a particular sound uttered by our friend, we have to disregard sounds uttered by other speakers, which can be considered distractors. A third form of uncertainty is stimulus uncertainty, which is a listener’s uncertainty in estimating the variability in a sound source’s generation of the nominally same stimulus19. Even once our sensory system isolates the signals that were produced by our friend from background noise or irrelevant signals, we may be uncertain as to what specific sound or word was produced. Furthermore, when presented with any form of uncertainty, observers may rely on the learned past statistics of stimuli20,21,22,23,24,25,26 to make their decisions, such as their long-term knowledge of how their friend typically talks or their short-term knowledge of their friend’s recent speech sounds27. Indeed, observers can learn both short-28,29 and long-term30,31,32 stimulus statistics, and this learning is considered crucial for efficient decision making33,34, with or without a conscious effort from the listener.
Whereas the effects of the factors – sensory uncertainty, stimulus relevance, stimulus uncertainty, and short- and long-term statistics – on sensory decision-making, such as categorization, have previously been studied separately19,35, in everyday situations we often confront them simultaneously. Thus, these factors may interact with each other, differentially affecting category decisions. For example, sensory uncertainty may not only decrease our ability to categorize a sound uttered by our friend but may also affect our estimate of the relevant versus the distractor sounds. Our prior expectations of what our friend is saying may play a greater role when our sensory or stimulus uncertainty is high or when there are more distracting sounds from other speakers in the environment. These expectations may be reduced if we are talking to a person we just met.
To test whether and how the interplay of these factors shapes sensory decisions, we measured the performance of human participants on an auditory categorization task36,37,38,39. More specifically, our goals were to test how observers’ sensory uncertainty and associated estimates of the likelihood of the irrelevant stimuli mediated their sensitivity to stimulus relevance, and how their task performance was affected by learning. Participants categorized trial sequences of three tones as either high frequency or low frequency in a two-alternative forced choice task, as we varied the number of task-irrelevant (distractor) tones and task-relevant (category-specific) tones across the trials. Additionally, to test learning, in a subset of experimental sessions, we modulated the proportion of trials biased toward the high or the low category.
Because it has been suggested that perceptual systems optimally integrate information from relevant inputs while ignoring the irrelevant ones40, we characterized participants’ behavior using normative Bayesian models of sensory decision-making41,42,43,44,45,46. The Bayesian formalism is a well-established framework to understand complex interactions among many factors governing decision-making13,14,30,31,47. On average, when categorizing auditory stimuli, participants integrated information across all three tones but weighed them by their task relevance. Stimulus uncertainty was similar across participants. However, individual participants’ sensitivity to task relevance depended on their internal sensory uncertainty and subjective estimates of the likelihood of irrelevant stimuli. We also found population-wide evidence of both learning short-term statistics (i.e., stimulus category of the previous trial) and learning long-term statistics (i.e., session-level bias toward high versus low category). However, individual participants’ knowledge of short- and long-term statistics was inversely correlated with their estimates of the likelihood of irrelevant stimuli. Together, our results demonstrate that humans differentially combine information about sensory uncertainty, stimulus relevance, stimulus uncertainty, and short-term and long-term learning, providing an integrated framework for understanding how the interplay of different mechanisms shapes auditory processing.
Results
Participants weigh tones according to stimulus relevance
Auditory categorization can depend on multiple factors, such as sensory uncertainty, stimulus relevance, stimulus uncertainty, and stimulus statistics that can vary over both short and long timescales. Because these factors often co-occur, they may interact with each other to affect decision-making; indeed, given how they are related, they should interact computationally. Here, we designed a categorization task to measure how these factors interacted in decision-making. In a two-alternative forced-choice auditory task, participants were asked to report the category of a three-tone sequence (Fig. 1A). Each trial was randomly selected to be from the low or the high category, and each tone of the trial could probabilistically be either a signal or a distractor (Fig. 1B-D).
Schematic diagram of the auditory categorization task. (A) A participant categorizes a given three-tone trial sequence as high or low. (B) Example low (top) and high (bottom) category trials from an unbiased session. The underlying probability distribution for signal tones is Gaussian (low-frequency Gaussian: light blue; high-frequency Gaussian: red) and for distractor tones is uniform (purple). The stimuli in each trial are three tones denoted by notes: signal tones from the low-frequency distribution (light blue), signal tones from the high-frequency distribution (red), and distractor tones from the uniform distribution (purple). (C) The 4 types of trial combinations in the task and their corresponding probabilities. (D) Example trial sequences from one participant’s data in the unbiased session.
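The generative process described above can be sketched in code. The 30% distractor probability is stated in the model description later in the paper; the frequency means, standard deviation, and range used here are illustrative placeholders, not the study's actual stimulus values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical generative parameters (illustrative; the paper's exact
# frequency values are not reproduced here).
LOW_MEAN, HIGH_MEAN = 2.0, 3.0   # centers of the two signal Gaussians
SIGMA = 0.3                      # shared SD of the signal Gaussians
FREQ_RANGE = (1.0, 4.0)          # full range for uniform distractors
P_DISTRACTOR = 0.3               # per-tone distractor probability (from the task)

def generate_trial(category, n_tones=3):
    """Draw one three-tone trial: signal tones from the category Gaussian,
    each independently replaced by a uniform distractor with P_DISTRACTOR."""
    mean = HIGH_MEAN if category == "high" else LOW_MEAN
    tones = rng.normal(mean, SIGMA, size=n_tones)
    is_distractor = rng.random(n_tones) < P_DISTRACTOR
    tones[is_distractor] = rng.uniform(*FREQ_RANGE, size=is_distractor.sum())
    return tones, is_distractor

tones, flags = generate_trial("low")
```

Because each tone is replaced independently, trials with zero, one, two, or three distractors all occur, matching the four trial combinations in Fig. 1C.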
We first tested whether participants accounted for the relevance of individual tones when categorizing the tone sequences. Because the total number of tones was the same during each trial, replacing a signal tone with a distractor tone both decreased the amount of information available to the listener regarding stimulus category and added irrelevant information. To account for this, we analyzed trials with the same number of distractors separately (Fig. 2A,B). Compared to signal tones, distractors should have less of an impact on participants’ decisions, because they have no relevance for the task. As expected, participants’ category choice probability was more strongly correlated with the frequency of the presented signal tones than with the frequency of the distractor tones (Fig. 2A; two-sided paired t-test, trials with one distractor: p = 1.32e-22; trials with two distractors: p = 5.36e-18; Figs S1C-H and Table 1). This confirms that participants used tone relevance when making decisions.
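The model-free influence measure used above can be approximated as follows. This is a simplified sketch: the paper's Methods correlate tone frequencies with average category choice probability, whereas here we correlate raw frequencies with binary choices across trials.

```python
import numpy as np

def tone_choice_correlation(tone_freqs, chose_high):
    """Approximate a tone's influence on category choice as the Pearson
    correlation between its frequency across trials and the binary
    high-category response (a simplification of the paper's binned analysis)."""
    tone_freqs = np.asarray(tone_freqs, dtype=float)
    chose_high = np.asarray(chose_high, dtype=float)
    return np.corrcoef(tone_freqs, chose_high)[0, 1]
```

A signal tone that drives the decision yields a strong positive correlation; a tone whose frequency is unrelated to the response (e.g., an ignored distractor) yields a correlation near zero.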
On average, participants weighted tones according to their relevance when making category decisions. (A) For both signal (grey) and distractor (purple) tones, their influence on category choice is computed using correlation of the respective tone frequencies with their associated average category choice probability (Methods). Error bars: standard error of the mean. Statistics: two-sided paired t-test between each subject’s normalized correlation coefficient of signal tone frequencies with choice probability and correlation coefficient of distractor tone frequencies with choice probability. (B) Average accuracy in unbiased trials decreased with the number of distractor tones. Error bars: standard error of the mean. Statistics: one-way repeated measures ANOVA; ptone_position = 2.82e-58, Ftone_position(3, 165) = 226.61. (C) For trials with exactly one distractor tone, accuracy was higher when the distractor tone frequency was similar to the trial category (Blue: low-category trials, red: high-category trials). Binned frequencies of the distractor are shown on the x-axis. Error bars: standard error of the mean. Statistics: two-tailed Wilcoxon signed-rank test between accuracy in high- and low- category trials computed at each distractor frequency. N = 56 participants. †p < 0.05, *p < 0.01, **p < 0.001, ***p < 0.0001.
Conversely, because distractor tones may still be integrated in the decision with some probability48,49, they may impair participants’ performance by providing information that should have been ignored. Indeed, as the number of distractors increased, the correlation between distractor tones and category choice, which we interpret as the influence of distractor tones on category choice, also increased whereas that of signal tones decreased (Fig. 2A; Figs S1B-D; one-way rmANOVA, significant main effect for number of distractors on correlations with signal tone frequencies: F(2,110) = 34.55, p = 2.28e-12, and on correlations with distractor tone frequencies: F(1,55) = 24.36, p = 8e-6). Additionally, accuracy decreased as the number of distractors increased (Fig. 2B; N = 56, correlation of accuracy with the number of distractors: Spearman, ρ = -1, p = 0). These findings collectively demonstrate that because participants were uncertain about the relevance of the different tones, the distractors adversely impacted their decisions.
Next, we tested whether the effect of distractors depended on their tone frequency. We hypothesized that a distractor tone whose frequency was in the same frequency range as the trial’s generative category center would impair participants’ decisions less than a distractor tone whose frequency was not in the same frequency range. We evaluated this prediction by computing the average participant accuracy for trials with one distractor tone. Category accuracy for these trials was significantly higher when the frequency of the distractor was similar to the trial category (Fig. 2C; Table 1 – Wilcoxon signed-rank test). For trials in which the distractor frequency was different from the trial category, we plotted separate psychometric curves corresponding to the signal or distractor tone frequencies (Fig S1J). A qualitative comparison of these curves to the psychometric curve for trials with only signal tones (Fig S1B) indicates that distractors have a substantial effect on participants’ category choice. Combined, these data suggest that participants’ ability to differentiate between signal and distractor tones depends on the frequency value of the distractors.
Participants exhibit high individual variability in performance
Whereas the previous analysis confirmed the expected effects of tone relevance on the mean performance of the participants, the extent to which tone relevance impacts choice varies idiosyncratically across individuals. This diversity may be due to participants’ distinct internal priors, estimates of relevance, measures of stimulus uncertainty, or sensory uncertainty. Indeed, there was significant heterogeneity in the correlation of tone frequency with category choice probability (for both signal and distractor tones, Figs. 3A, B) and with accuracy (across trials with one and two distractors, Fig. 3C). Accuracy ranged from 57% to 89% for trials with one distractor tone and from 53% to 75% for trials with two distractor tones. Thus, in subsequent analyses, we quantified the effects of relevance, stimulus uncertainty, sensory uncertainty, and learning of stimulus statistics not only on the average performance across the population but also on inter-participant variability. This allowed us to probe how these different real-life factors might mediate individual participants’ decision-making processes.
Individual participants differed in their estimate of relevance (A, B) For each participant, influence of tone type on category choice is computed using the correlation of respective tone frequencies with category choice probability. (A) Influence of signal versus distractor tones for trials with one distractor. Statistics: two-tailed Wilcoxon signed-rank test for the 56 participants. (B) Influence of signal versus distractor tones for trials with two distractors. Statistics: two-tailed Wilcoxon signed-rank test for the 56 participants. (C) Categorization accuracy for the tone sequences with one versus two distractor tones. Statistics: two-tailed Wilcoxon signed-rank test for the 56 participants. Error bars: standard error of the mean obtained by assuming that a participant’s accuracy follows a binomial distribution; not large enough to be distinguishable. †p < 0.05, *p < 0.01, **p < 0.001, ***p < 0.0001.
Bayesian model can account for participant performance
So far, we have demonstrated in a model-free way that when making decisions, participants across the population accounted for the relevance of the different tone frequencies as they integrated information from the three tones. Next, to investigate how these factors shaped individual participant’s category decisions, we modeled each participant’s behavior using a Bayesian approach. We first discuss how tone frequency should affect individual category choice probability and then lay out three models for understanding decisions and their predictions for this relation.
In a standard two-alternative forced-choice (2AFC) frequency categorization task with no distractor tones, tone frequencies at the intersection of the two Gaussian distributions do not provide evidence for either category. As the tone frequencies move further away from the decision boundary, tones become more informative about the trial’s category (Figs. 4A, B), and consequently a participant’s accuracy increases the further the tone frequencies are from the decision boundary. Thus, in the absence of distractors, if a trial consisted of a single tone, a participant’s choice probability would be modeled by a standard psychometric curve, which is a monotonically increasing saturating sigmoid. In contrast, if the trial consisted of multiple tones, choice probability could be modeled as the product of the relevant sigmoidal single-tone psychometric curves.
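The no-distractor case above can be made concrete: in log-odds form, per-tone likelihood ratios add, which is the Bayesian counterpart of multiplying single-tone evidence. The parameter values below are illustrative assumptions, not the study's fitted values.

```python
import numpy as np

# Illustrative generative parameters (assumed; not the paper's exact values)
LOW_MEAN, HIGH_MEAN, SIGMA = 2.0, 3.0, 0.3

def gauss_logpdf(x, mu, sigma):
    """Gaussian log density, written out to keep the sketch dependency-free."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def p_high_no_distractors(tones, prior_high=0.5):
    """Posterior probability of the 'high' category when every tone is treated
    as signal: per-tone log-likelihood ratios add, so a single tone yields a
    sigmoidal psychometric curve and multiple tones compound the evidence."""
    tones = np.asarray(tones, dtype=float)
    log_odds = (gauss_logpdf(tones, HIGH_MEAN, SIGMA)
                - gauss_logpdf(tones, LOW_MEAN, SIGMA)).sum()
    log_odds += np.log(prior_high / (1 - prior_high))
    return 1.0 / (1.0 + np.exp(-log_odds))
```

A tone at the intersection of the two Gaussians (here 2.5) contributes zero log-odds, so a single such tone yields a choice probability of exactly 0.5, while tones deeper inside either distribution quickly saturate the curve.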
Graph of the Bayesian Model (A) The (top level) category identity C of the tone-burst sequence constrains the values of the (middle level) three individually generated tone frequencies vi. Each tone frequency has a probability of 30% of being replaced by a distractor frequency uniformly drawn from the full frequency range. The (bottom level) auditory sensory signal mi represents a noisy measurement of the true tone frequency vi. The black arrows define the generative conditional probability densities p(v|C) and p(m|v). The task of the observer is to infer the category membership of the full tone sequence v from the noisy sensory measurement m = {m1, m2, m3} while considering the possibility of distractor tone bursts (green arrows). (B) The Full Bayesian Model considers participants to be veridical about the parameters of the generative distribution and the distractor probability. Top: Given a particular category, the perceived probability of a certain tone frequency is governed by the respective conditional distribution p(v|C) and the probability of that tone being replaced by a distractor tone pdistractor. Middle: The sensory process of the Bayesian observer is modeled as a Gaussian process centered at the true stimulus frequency. Bottom: If the tones can be judged to be either signal or distractor, the psychometric curve becomes ‘S’-shaped. Further, this curve plateaus at 0.5 because frequency values at the tails of the Gaussian distributions are more likely to be distractors than signal tones. (C) The No Distractors Model considers those participants that believe that there are no distractors, and therefore all perceived stimuli are drawn from either the “high" or “low” distributions. Top: Given a particular category, the perceived probability of a certain tone frequency is governed by the respective conditional distribution p(v|C). Middle: The sensory process of the Bayesian observer is modeled as a Gaussian process centered at the true stimulus frequency. 
Bottom: If a trial is judged to contain only signal tones that were drawn from the low- or high- frequency signal Gaussian distributions, the psychometric curve becomes sigmoidal, as in a traditional 2AFC task. (D) The Random Guess Model considers those participants that assign category identities at random, as if all tone bursts were drawn from the distractor distribution. Top: Given a particular category, the perceived probability of a certain tone frequency is flat across the stimulus range. Middle: The sensory process of the Bayesian observer is modeled as a Gaussian process centered at the true stimulus frequency. Bottom: If a trial is judged to contain only distractor tones, the psychometric curve becomes flat, as each tone is not informative to the category decision. Blue: p(v|C = L), red: p(v|C = H), purple: p(v|D).
However, in the presence of distractors, we expect participants’ performance to be substantially different. Because tones whose frequencies are far from the category-specific Gaussians are more likely to be distractors than signal tones (Figs. 4C, D), participants may assign lower relevance to those tones. This effect would result in poorer accuracy for tones lying at the tails of the Gaussian distributions (Fig. 4D).
We developed a general framework to capture potential strategies that a participant could use when categorizing the three-tone sequences 41,50,51,52. These strategies depend on a participant’s internal model of stimulus relevance of the three tone frequencies and can be formalized using the Bayesian approach (Fig. 5A). The full Bayesian model reflects a decision-making strategy for a participant (Fig. 5A table, top). This model estimates the participant’s belief about the relevance of each of the three tones and integrates information across the tones accordingly. It also captures the participant’s prior beliefs of the trial categories. As alternative models, we considered two reduced variants of this model. In the no-distractor model, the participant assumed that all three tones were ‘signal’ (Fig. 5A table, center). Conversely, in the random-guess model, the participant assumed that all three tones were irrelevant ‘distractors’ and made their choice randomly akin to a coin flip based on only their learned prior beliefs (Fig. 5A table, bottom). This model controls for the possibility that participants ignored all the tones when they made their decision. Henceforth, we interpret our experimental findings using this three-model Bayesian framework.
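A minimal sketch of the full Bayesian model's likelihood follows: each tone's likelihood under a category is a mixture of the signal Gaussian (with stimulus and sensory uncertainties combined in quadrature) and the uniform distractor density, which is what makes extreme frequencies uninformative and the psychometric curve plateau near 0.5. The 30% distractor probability comes from the task description; the remaining parameter values are illustrative assumptions.

```python
import numpy as np

# Illustrative parameter values (assumed; the paper fits these per participant)
LOW_MEAN, HIGH_MEAN = 2.0, 3.0
SIGMA = 0.3            # stimulus uncertainty: SD of the signal Gaussians
SIGMA_SENSORY = 0.1    # sensory noise on each measured tone
FREQ_LO, FREQ_HI = 1.0, 4.0
P_DISTRACTOR = 0.3     # generative distractor probability (stated in the task)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def tone_likelihood(m, mean):
    """p(m | C): mixture of a signal Gaussian (stimulus and sensory
    uncertainties added in quadrature) and a uniform distractor density."""
    sigma_eff = np.hypot(SIGMA, SIGMA_SENSORY)
    p_signal = gauss_pdf(m, mean, sigma_eff)
    p_dist = 1.0 / (FREQ_HI - FREQ_LO)
    return (1 - P_DISTRACTOR) * p_signal + P_DISTRACTOR * p_dist

def p_high_full_bayes(measurements, prior_high=0.5):
    """Full Bayesian posterior of the high category: each tone is marginalized
    over its signal-vs-distractor identity, so extreme frequencies (likely
    distractors) carry little evidence and the curve plateaus near 0.5."""
    m = np.asarray(measurements, dtype=float)
    log_odds = (np.log(tone_likelihood(m, HIGH_MEAN))
                - np.log(tone_likelihood(m, LOW_MEAN))).sum()
    log_odds += np.log(prior_high / (1 - prior_high))
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Setting P_DISTRACTOR to 0 recovers the no-distractor model's sigmoid, and treating every tone as a distractor reduces the posterior to the prior, i.e., the random-guess model; the three models in the framework differ only in this assumed distractor probability.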
Schematic diagram of the 3 decision-making models; psychometric curves and model fits shown for representative participants as well as averaged across all participants. (A) A schematic of all 3 decision-making models. (top) The full Bayesian model: the participant considers that any tone can probabilistically either be ‘signal’ or ‘distractor’ (see main text for more information). Using a hypothetical trial, we illustrate that deciding that a tone is ‘signal’ versus ‘distractor’ shifts participants’ category choice probabilities. (center) The no-distractor model: the participant considers all tones to be ‘signal’ tones, and thus a tone that the full Bayesian model interprets as a distractor might instead be interpreted as a high signal tone. (bottom) The random-guess model: the participant considers all tones to be ‘distractor’ tones and thus makes their decision randomly. \(\widehat{L}\): \(\widehat{Low}\)-, \(\widehat{H}\): \(\widehat{High}\)—category decision. (B, C, D) Psychometric curves (black lines) and respective model fits (full Bayesian model: orange lines, no-distractor model: teal lines, random-guess model: magenta lines) for 3 example participants (for posteriors of the full Bayesian model, and for GLM analysis, see Fig S3; for more participants see Fig S4). The x-axes of the psychometric curves span the frequency range of the stimulus tones, and the y-axes denote the mean probability of a \(\widehat{High}\) category choice response. Error bars: standard error of the mean. (B) Example participant with high task accuracy on trials with either one or two distractor tones. (C) Example participant with high task accuracy on trials with one distractor, and poor accuracy on trials with two distractors. (D) Example participant with poor accuracy on trials with both one distractor and two distractor trials. (E) Psychometric curve and model fits averaged across all participants.
We first examined the performance of a representative participant with high accuracy on trials with distractors (Fig. 5B, one-distractor accuracy 80.59% and two-distractor accuracy 74.14%). Tones that were far from the two category distributions and thus more likely to be distractors were correlated less strongly with behavioral performance than tones closer to the middle of the distributions (Fig. 5B, black). Additionally, the full Bayesian model (Fig. 5B, orange) fit the participant’s nonmonotonic behavior considerably better than the no-distractor or random-guess models (Fig. 5B, teal and magenta, respectively; Table 1 – two-sided Wilcoxon signed-rank test). These analyses collectively demonstrated that this participant used the relevance of the different tones when categorizing trials.
We next studied the performance of two participants with intermediate and low accuracy. Their choice probability for tones drawn from the flanks (highest or lowest frequencies) of the signal distributions changed less in comparison to the participant with higher accuracy, resulting in psychometric curves that were more sigmoidal in nature (Figs. 5C, D). Furthermore, the difference in accuracy on trials with one and two distractors was lower for the low-accuracy than high-accuracy participants (Fig. 5C: one-distractor accuracy 78.24% and two-distractor accuracy 59.48%, Fig. 5D: one-distractor accuracy 59% and two-distractor accuracy 61.21%). Even for these participants, although the effect was smaller, the full Bayesian model provided a better fit to the psychometric curves compared to the other two Bayesian models (orange curve versus the teal and magenta curves; Table 1 – two-sided Wilcoxon signed-rank test).
These three examples suggest that potentially all participants, irrespective of their accuracy, are impacted by stimulus relevance when choosing the stimulus category, albeit to varying degrees. To test this hypothesis across the entire population, we plotted the average psychometric curve and the average model fits (Fig. 5E). We found that the data were well fit by the full Bayesian model, which qualitatively illustrates a population-wide effect of stimulus relevance on category choice decisions.
Next, we quantified this effect by comparing the statistics for the fits of the Bayesian models. Because the number of fitting parameters differs across the three models (6 in the full Bayesian, 5 in the no-distractor, and 2 in the random-guess model (see Methods and Fig S5)), we used Bayesian Information Criterion (BIC) scores to compare them. A lower BIC score indicates a better model fit. For all participants, the full Bayesian model was better than the random-guess model (Fig. 6A; population statistics: two-sided Wilcoxon signed-rank test on BIC, Z = -6.51, p = 7.55e-11) at describing categorization behavior. This implies that participants do not simply rely on their prior beliefs when classifying the tone sequences.
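The model comparison above uses the standard BIC formula, k ln n − 2 ln L̂, with the parameter counts given in the text (6, 5, and 2 for the full Bayesian, no-distractor, and random-guess models). A minimal helper:

```python
import numpy as np

def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L_hat). The ln(n)
    penalty means the 6-parameter full Bayesian model must fit meaningfully
    better than the 5-parameter no-distractor model to earn a lower score."""
    return n_params * np.log(n_trials) - 2.0 * log_likelihood
```

For example, with identical log-likelihoods the no-distractor model (5 parameters) scores lower (better) than the full Bayesian model (6 parameters), so the full model wins in Fig. 6 only because its fit improvement outweighs the complexity penalty.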
Nearly all participants were sensitive to stimulus relevance. On unbiased trials, BIC scores of the full Bayesian model were lower than that of the (A) random-guess model for all participants and (B) the no-distractor model for all except two participants. The two outliers are shown in light pink. Statistics: two-tailed Wilcoxon signed-rank test comparing model fits across the population. (C) Participants’ accuracy was inversely correlated with difference in BIC scores between the full Bayesian and no-distractor models. Statistics: Spearman correlation. In all panels, data point: one participant; †p < 0.05; *p < 0.01, **p < 0.001, ***p < 0.0001. Error bars in (A, B) and horizontal error bars in (C): 95% confidence intervals derived from respective model fits to 100 bootstrapped samples. Vertical error bars in (C): standard error of the mean, obtained by assuming that accuracy follows a binomial distribution; vertical error bars not large enough to be distinguishable.
For most participants, except for two, the full Bayesian model was also better than the no-distractor model (Fig. 6B; same test, Z = -6.49, p = 8.41e-11). For the two outlier participants (Fig. 6B, C; light pink), the full Bayesian and no-distractor model fits were similar (Table 1 – two-sided Wilcoxon signed-rank tests), indicating that they did not necessarily account for relevance when making decisions. However, for the rest, the full Bayesian model reproduced the main patterns in the data in an absolute sense because it accurately captured participants’ responses across tone frequencies (Figs. 5B-E, Figs S4A, D, G). These results confirm that nearly all participants took stimulus relevance into account when making category decisions.
Finally, we correlated model fits to participants’ task accuracy. Across the population, the difference in BICs of the full Bayesian and the no-distractor models was inversely correlated with accuracy (Fig. 6C, Spearman, ρ = -0.85, p = 8.53e-17). For highly accurate participants, the full Bayesian model performed particularly well. These data, once again, suggest that participants’ behavioral performance was determined by their ability to discriminate tones based on their decision relevance.
We also explored the space of non-Bayesian models that had fewer assumptions. For the first model, we implemented a simple boundary model that had three parameters: (1) the frequency dividing the low distractor range from the low-signal range, (2) the frequency dividing the low and high signal ranges, and (3) the frequency dividing the high signal range from the high-distractor range. We found that this model was informative for the high-accuracy participants and consistent with the full Bayesian model (Figs S2C, D), but was not informative for the other participants.
For the high-accuracy participants, this boundary model provided insight into the extent to which they discounted extreme stimuli. We found that these participants exaggerate the signal-frequency ranges by underestimating the lower signal boundary (separating distractor from low signal) and overestimating the upper signal boundary (separating high signal from distractor) (Fig S2A). This matches the full Bayesian fitting results, which also show a tendency to exaggerate the distance of the perceived generative signal distributions from the center of the stimulus range (Figs S5B, C).
For the second model, we implemented a generalized linear model (GLM) (Figs S3A, E). We found that this model was consistent with the full Bayesian model for high-accuracy participants. The observation that high- and intermediate-accuracy participants are impacted by stimulus relevance in their decision-making process was substantiated by the corresponding boundary models and GLMs (Figs S2D, E, S3F, G). The GLM analysis was also able to substantiate the finding that the difference in accuracy on trials with one and two distractors was lower for the low-accuracy than high-accuracy participants (Fig S4).
However, the boundary model and GLM differed in their ability to capture the behavior of low-accuracy participants. For these participants, the boundary model assigned internal boundaries that caused most of the stimulus range to be labeled as distractor tones (Fig S2B). In contrast, the GLM results indicate that even these participants understood the generative distributions (Fig S3F), which is backed by the full Bayesian results showing that these participants were still best fit by the full Bayesian model (Figs. 5D, S3C). Although the GLM was able to describe the behavior of low-accuracy participants, it was not as informative as the Bayesian models regarding the interpretation of the parameters (Figs S3E-G). Together, these results indicate that individual variability in performance is not captured by the boundary and GLM models, which are primarily driven by representations of the generative distributions. Therefore, other factors likely account for individual variability in performance.
Sensory uncertainty and estimated task relevance underlie inter-participant variability
Individual participants’ ability to discriminate between generative distributions of the signal and distractor tones may depend not only on their subjective estimates of the tones’ task relevance (or more specifically, their estimate of the probability of distractors) but also on their measure of stimulus uncertainty (which is the standard deviation of the two Gaussian distributions) and their sensory uncertainty. For instance, some participants may have different prior convictions about the number of distractors (e.g., some may spend more time in noisy environments than others), which could bias how they integrate information over the three tones. Additionally, we expect that participants with low stimulus and sensory uncertainty would distinguish better between signal and distractor tone frequencies than participants with high uncertainty. A participant’s sensory uncertainty determines the variability in perception of the same tone frequency over multiple presentations, whereas the stimulus uncertainty captures the participant’s estimate of the variability in the underlying source generating the same stimulus. Thus, a participant with low stimulus and sensory uncertainties would perceive repeated presentations of the same tone more consistently and would estimate the distribution of the low (high) tone source to be narrower than a participant with high stimulus and sensory uncertainties. Consequently, we conjectured that estimated priors for the probability of distractors, stimulus uncertainty, and sensory uncertainty would drive individual participants’ category choice and task accuracy contributing to inter-individual variability.
To test the effect of these three factors on category choice, we first analyzed the inter-participant variability in these factors. We found that across the 56 participants, stimulus uncertainty (denoted by model parameter \(\sigma\)) was nearly constant (Fig S5D), whereas sensory uncertainty (denoted by parameter \({\sigma }_{sensory}\)) and the estimate of the probability of distractors (denoted by model parameter \({p}_{distractor}\)) were participant-specific (Figs S5A,E,F). To quantify a relationship between \({p}_{distractor}\) and \({\sigma }_{sensory}\), we defined a metric, ‘sigmoidicity’ (Fig. 7A; see Methods), which quantitatively captured each participant’s category decisions across all trials. We found that this metric was significantly influenced by \({p}_{distractor}\) and \({\sigma }_{sensory}\) as well as by the interaction between these two parameters (Fig. 7B; Table 1 – two-factor regression model). Specifically, sigmoidicity increased with \({\sigma }_{sensory }\) and decreased with \({p}_{distractor}\). In other words, sigmoidicity could be used to quantify the extent to which participants considered the presence of distractors when categorizing.
For an individual participant, category choice was driven by both sensory uncertainty and the subjective estimate of the probability of distractors, whereas accuracy was driven mostly by sensory uncertainty. (A) Cartoon models of different psychometric curves illustrate the qualitative range of sigmoidicity. Left: high sigmoidicity; right: low sigmoidicity. (B) Each data point corresponds to a single participant; color denotes \({\text{p}}_{\text{distractor}}\), the participant’s estimated probability of distractors (see color bar); black: \({\text{p}}_{\text{distractor}}\) \(\sim\) 0; green: \({\text{p}}_{\text{distractor}}\) \(\sim\) 1. \({\text{p}}_{\text{distractor}}\) is a fitting parameter in the full Bayesian model and has the value 0 in the no-distractor model. The model parameter \({\upsigma }_{\text{sensory}}\) captures a participant’s sensory uncertainty. Statistics: effect of \({\text{p}}_{\text{distractor}}\) and \({\upsigma }_{\text{sensory}}\) on sigmoidicity using a two-factor regression model. (C) Participants’ accuracy did not significantly depend on \({\text{p}}_{\text{distractor}}\) but was inversely correlated with their sensory uncertainty. \({\text{p}}_{\text{distractor}}\) is color coded as in (B). Statistics: effect of \({\text{p}}_{\text{distractor}}\) and \({\upsigma }_{\text{sensory}}\) on accuracy using a two-factor regression model. In all panels, data point: one participant; †p < 0.05; *p < 0.01, **p < 0.001, ***p < 0.0001. Error bars in (B) and horizontal error bars in (C): 95% confidence intervals derived from respective model fits to 100 bootstrapped samples. Vertical error bars in (C): standard error of the mean, obtained by assuming that accuracy follows a binomial distribution; these error bars are too small to be distinguishable.
We then tested the effect of \({p}_{distractor}\) and \({\sigma }_{sensory}\) on participants’ accuracy. Although we did not find a relationship between accuracy and \({p}_{distractor}\) (Fig. 7C; Table 1 – two-factor regression model), we found a strong, approximately linear relationship between accuracy and \({\sigma }_{sensory}\) (Spearman, ρ = -0.96, p = 3.15e-31). Taken together, these data demonstrate that participants had a similar measure of stimulus uncertainty but differed in their sensory uncertainty and estimate of relevance. Further, although participants’ category decisions were shaped by both their estimate of the probability of distractors and their sensory uncertainty, much of the difference in their accuracy may be driven by differences in their ability to distinguish between tone frequencies.
Next, we tested whether listeners differentially weighed tones based on their position in the sequence when making their category decision. To test this idea, we developed an alternative Bayesian model with a nonuniform psychophysical kernel, in which we replaced \({p}_{distractor}\) with three tone-position-dependent versions of \({p}_{distractor}\); note that a higher \({p}_{distractor}\) at a position implies that tones at that position are more often treated as distractors, i.e., weighed less. We found that participants weighed the first two tone positions (\({p}_{distracto{r}_{position1}}\): Mdn = 0.7, IQR = 0.24; \({p}_{distracto{r}_{position2}}\): Mdn = 0.7, IQR = 0.28) equally but slightly more than the third tone position (\({p}_{distracto{r}_{position3}}\): Mdn = 0.85, IQR = 0.17; Fig S6G; Table 1 – one-way rmANOVA, post hoc two-sided Wilcoxon signed-rank test) when making their category decision. Additionally, values for \({p}_{distracto{r}_{position3}}\), but not \({p}_{distracto{r}_{position1}}\) and \({p}_{distracto{r}_{position2}}\), were statistically different from \({p}_{distractor}\) (Fig S6G; Mdn = 0.59, IQR = 0.43; Table 1 – two-sided Wilcoxon signed-rank test). The influence of all three position-dependent weights on sigmoidicity and accuracy was consistent with results (Figs. S6A-F; Table 1 – Spearman) from the full Bayesian model (Figs. 7B, C), in that both \({p}_{distractor}\) and \({\sigma }_{sensory}\) were correlated with sigmoidicity, whereas \({\sigma }_{sensory}\), but not \({p}_{distractor}\), was correlated with accuracy. The finding that the third tone position is discounted relative to the first two is also supported by results from the simpler GLM (Fig. S6H; Table 1 – one-way rmANOVA, post hoc two-sided Wilcoxon signed-rank test). We found that this alternative tone-position-dependent Bayesian model outperformed the position-independent model, likely because it captured this discounting of the tone in the third position (two-sided Wilcoxon signed-rank test on BIC, T = 0, p = 7.55e-11).
Overall, these data show that participants may have used information from all three tone positions but weighed the third tone somewhat less than the first two.
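As an illustration of how position-dependent relevance estimates could enter the decision, the sketch below scores a three-tone sequence under each category with a per-tone mixture of a uniform distractor density and the category Gaussian. The decision rule, the example parameter values, and the use of continuous densities (instead of the task’s 30-value frequency grid) are simplifying assumptions, not the fitted model:

```python
import numpy as np

# Generative parameters from the task (log10 Hz):
MU_L, MU_H, SIGMA = 2.55, 2.85, 0.1
LOG_RANGE = np.log10(3000) - np.log10(90)   # support of the distractor uniform

def gauss_pdf(x, mu, sigma):
    """Gaussian density, used here as the signal-tone distribution."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def category_likelihood(tones, mu, p_distractor):
    """Likelihood of a three-tone sequence under one category: each tone
    is a mixture of a uniform distractor (with a position-dependent
    probability) and the category's signal Gaussian."""
    tones, p_d = np.asarray(tones), np.asarray(p_distractor)
    per_tone = p_d / LOG_RANGE + (1 - p_d) * gauss_pdf(tones, mu, SIGMA)
    return float(per_tone.prod())

# Position-dependent relevance estimates (illustrative values near the
# reported medians): the third tone is discounted the most.
p_d = [0.7, 0.7, 0.85]

tones = [2.82, 2.88, 2.40]   # two tones near the high mean, one low outlier
L_high = category_likelihood(tones, MU_H, p_d)
L_low = category_likelihood(tones, MU_L, p_d)
print("choose High" if L_high > L_low else "choose Low")   # -> choose High
```

Because the outlier lands at the heavily discounted third position, the two tones near the high mean dominate and the sketch chooses the High category.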
Using this differential weighting of the tone positions, we reconstructed the psychometric curves (Fig. 8) for the representative participants shown in Fig. 5. This reconstruction accounts for a participant’s accumulation of evidence across the three tone positions. Interestingly, although the full Bayesian model was simpler, both models adequately fit the experimental psychometric curves, and for both models the underlying \({p}_{distractor}\) values were higher than the true experimental value of 0.3. Participants thus tended to discount tones more than the statistics of the task would require.
Psychometric curves for representative participants fitted using the tone-position-dependent Bayesian model. (A-C) Psychometric curves (black lines) and model fits of the tone-position-dependent Bayesian model (orange lines) for the same three example participants as in Fig. 5. The x-axes of the psychometric curves span the frequency range of the stimulus tones, and the y-axes denote the mean probability of a \(\widehat{\text{High}}\) category choice response. Error bars: standard error of the mean. (A) Example participant with high task accuracy on trials with either one or two distractor tones. (B) Example participant with high task accuracy on trials with one distractor and poor accuracy on trials with two distractors. (C) Example participant with poor accuracy on trials with either one or two distractors.
Participants rely on past statistics of category probabilities
Having established how stimulus uncertainty, sensory uncertainty and stimulus relevance shape auditory categorization, we next asked whether and how participants’ learning of the past statistics of the category probabilities informed their current decision. We first analyzed the influence of short-term statistics by computing, for unbiased sessions, the mean probability of a \(\widehat{\text{High}}\) category choice response on a given trial (\(t\)) conditioned on the category of the previous trial (\(t-1\)): \(\text{p}(\widehat{{\text{H}}_{t}}|{\text{H}}_{t-1})\) and \(\text{p}\left(\widehat{{\text{H}}_{t}}|{\text{L}}_{t-1}\right)\) (Fig. 9A). We found that the difference between the two curves was quite variable across participants, suggesting that although some participants’ decisions were substantially affected by the short-term history of category probabilities (Figs. 9C, D), others’ decisions were not (Fig. 9B). Overall, across the population, the category of the previous trial (i.e., short-term stimulus statistics) significantly influenced participants’ choices on the current trial (Fig. 9E; \({\Delta }_{\text{short}-\text{term}}:\) difference between \(\text{p}(\widehat{{\text{H}}_{t}}|{\text{H}}_{t-1})\) and \(\text{p}\left(\widehat{{\text{H}}_{t}}|{\text{L}}_{t-1}\right)\) at the central frequency of 511 Hz; central frequency: the mid-point of the experimental frequency range in log space; two-sided Wilcoxon signed-rank test on \({\Delta }_{\text{short}-\text{term}}\), Z = 6.04, p = 1.58e-9). Moreover, this influence was restricted to the category of the previous trial and did not substantially depend on whether the participant had responded accurately to that trial (Fig S7; Table 1 – two-sided Wilcoxon signed-rank test on \({\Delta }_{\text{CorrectVsIncorrect}}\)). Combined, these data confirm the presence of category-driven short-term learning in decision-making.
To a varying degree, participants incorporated short-term statistics of category probabilities in their decisions. (A) Schematic showing the effect of learning short-term statistics. (B-D) Average psychometric curves (solid lines; light blue: previous trial low, red: previous trial high) and the adapted full Bayesian model fits (same colors, dashed lines) conditioned on the previous trial’s category type for the same three participants as in Fig. 5 in the unbiased session. Error bars: standard error of the mean. \({\Delta }_{short-term}\), the difference between \(p(\widehat{High})\) for the two curves computed at the central tone frequency of 511 Hz: (B) \(\sim\) 0.1, (C) \(\sim\) 0.22 and (D) \(\sim\) 0.32. (E) Corresponding psychometric curves averaged across all participants in the unbiased session; \({\Delta }_{short-term}\): \(\sim\) 0.12. Statistics: two-sided Wilcoxon signed-rank test on \({\Delta }_{short-term}\). In panel (E), †p < 0.05; *p < 0.01, **p < 0.001, ***p < 0.0001.
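The conditioning analysis above can be sketched on synthetic data (the trial-generating probabilities below are assumptions chosen only to mimic a mild short-term bias, not a participant’s actual data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic session (an assumption standing in for one participant):
# the response leans toward the current trial's category but is nudged
# toward the PREVIOUS trial's category, mimicking short-term learning.
n = 2000
category = rng.integers(0, 2, size=n)       # 0 = Low, 1 = High
response = np.zeros(n, dtype=int)
response[0] = category[0]
for t in range(1, n):
    p_high = 0.1 + 0.7 * category[t] + 0.1 * category[t - 1]
    response[t] = int(rng.random() < p_high)

# Condition trial t's choice on trial t-1's category:
prev, resp = category[:-1], response[1:]
p_given_high = resp[prev == 1].mean()       # p(H_hat_t | H_{t-1})
p_given_low = resp[prev == 0].mean()        # p(H_hat_t | L_{t-1})
delta_short_term = p_given_high - p_given_low
print(f"delta_short_term = {delta_short_term:.3f}")
```

With these generating probabilities the expected \({\Delta }_{short-term}\) is 0.1; repeating the split at each tone frequency, rather than pooled over trials, yields the pair of conditioned psychometric curves shown in the figure.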
In conjunction with short-term learning, participants may also use long-term learning when the trial categories are biased toward low or high over a longer timescale. To test whether participants were affected by short- and/or long-term learning, in separate behavioral sessions we changed the probability of the low-category trials. Specifically, \({p}_{L}=0.7\) in biased low sessions and \({p}_{L}=0.3\) in biased high sessions (Fig. 10A; participants: N = 53 in biased low and N = 48 jointly in biased low and biased high). Anecdotally, individual participants with high values of \({p}_{distractor}\) did not use either type of learning when making decisions (Fig. 10B; Fig S8A). However, participants with lower values of \({p}_{distractor}\) appeared to use previous trial statistics to varying degrees (Figs. 10C, D). Some participants used both short- and long-term learning (Fig. 10C; Fig S8B), whereas others primarily used long-term learning (Fig. 10D; Fig S8C). Overall, in the population, there was a clear effect of both the session type and the last trial on category choice (Fig. 10E, Figs S7-9; \({\Delta }_{\text{long}-\text{term}}\): difference between psychometric curves from biased low and biased high sessions at the central frequency; population statistics: two-sided Wilcoxon signed-rank test on \({\Delta }_{\text{long}-\text{term}}\), Z = 8.63, p = 6.34e-18; two-sided Wilcoxon signed-rank test on \({\Delta }_{\text{short}-\text{term}}\), biased low: Z = 5.94, p = 2.91e-9; biased high: Z = 5.94, p = 2.91e-9). Thus, data across the three sessions – unbiased, biased low and biased high – suggest that both short- and long-term learning influenced participants’ decision-making, presumably depending on their relevance estimates.
To a varying degree, participants’ decisions were affected by long-term statistics of category probabilities. (A) Example stimuli from unbiased, biased low, and biased high trials. Individual tones are color-coded according to their underlying distribution as shown in Fig. 1B. (B-D) Average psychometric curves for three participants in the unbiased (black), biased low (light blue) and biased high (red) sessions. Respective model fits are shown using dashed lines; (B, C) were fit using the full Bayesian model, and (D) was fit using the random-guess model for the biased sessions and the full Bayesian model for the unbiased session. Error bars: standard error of the mean. Curves for the biased sessions are evaluated using balanced datasets (see Methods). Curves for the unbiased trials are similar to the black lines in Figs. 5B-D. \({\Delta }_{long-term}\), the difference between \(p(\widehat{High})\) for biased low and biased high trials computed at the central tone frequency of 511 Hz: (B) \(\sim 0.06\), (C) \(\sim 0.35\) and (D) \(\sim 0.51\). (E) Psychometric curves averaged across all participants; \({\Delta }_{long-term}\): \(\sim\) 0.2. Statistics: two-sided Wilcoxon signed-rank test on \({\Delta }_{long-term}\). In panel (E), †p < 0.05; *p < 0.01, **p < 0.001, ***p < 0.0001.
To further quantify how learning modulates category decision-making, we adapted the full Bayesian model to incorporate both a short-term (exponentially weighted: \({W}_{1}{e}^{-t/\tau }\)) and a long-term (constant: \({W}_{constant}\)) component (values of \({W}_{1}\) and \({W}_{constant}\) near 0 imply no learning, whereas values near 1 imply strong learning; \(\tau\) is a time constant that captures the number of preceding trials that a participant may have used to learn the short-term stimulus statistics). This adapted model thus combined the effects of sensory uncertainty, stimulus relevance, stimulus uncertainty, and learning of long- and short-term stimulus statistics.
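A minimal sketch of how such a prior could combine the two components (the mixing rule and parameter values are illustrative assumptions, not the paper’s exact update equation):

```python
import numpy as np

def prior_high(prev_categories, w1, tau, w_constant, session_p_high):
    """Illustrative category prior combining long- and short-term learning
    (an assumption, not the fitted model's exact equation): a constant term
    with weight W_constant pulls the prior toward the session's bias, and
    the trial k steps back contributes with weight W1 * exp(-k / tau)."""
    p = 0.5 + w_constant * (session_p_high - 0.5)
    # prev_categories is ordered oldest -> newest; k = 1 is the last trial
    for k, c in enumerate(reversed(prev_categories), start=1):
        p += w1 * np.exp(-k / tau) * (0.5 if c == 1 else -0.5)
    return float(np.clip(p, 0.0, 1.0))

# With tau ~ 0.63 (the median fit), the immediately preceding trial
# dominates the short-term component.
w1, tau, w_const = 0.4, 0.63, 0.2
after_high = prior_high([0, 0, 1], w1, tau, w_const, session_p_high=0.5)
after_low = prior_high([0, 0, 0], w1, tau, w_const, session_p_high=0.5)
print(f"prior after High: {after_high:.3f}, after Low: {after_low:.3f}")
```

With \(\tau \sim\) 0.63, the weight of the last trial (\({W}_{1}{e}^{-1/\tau }\)) exceeds the combined weight of all earlier trials, reproducing the observed one-trial-back effect.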
Using this framework, we asked how participants’ measure of stimulus relevance and their sensory uncertainty modulated their learning of stimulus statistics. We first tested whether participants accounted for stimulus relevance when categorizing trials during biased low and high sessions. We found that the performance of all but two participants was best fit by the full Bayesian model (Fig S10). These two outlier participants primarily used their learned prior, and their performance was thus best fit by the random-guess model (Figs S10A, B, D, E). These data reinforce our previous finding that stimulus relevance is central to categorization.
Next, we assessed the degree to which participants learned the short-term statistics of trials by fitting the adapted version of the full Bayesian model to data from the unbiased session. We found that nearly all participants demonstrated significant short-term learning (Fig. 11A; \({W}_{1}>0\) for N = 55). However, there was substantial variability across participants: for some, \({W}_{1}\sim 0\), whereas for others it was \(\sim 1\). We observed similar heterogeneity in the learning time constant, which was inversely correlated with participants’ accuracy (Fig. 11B; Spearman, ρ = -0.27, p = 0.042). However, across the population, the average time constant (\(\tau\)) in the unbiased session was \(\sim\) 1 trial (Mdn: 0.63 trial, IQR: 0.44; Figs. 11A, B). The pattern of \({W}_{1}\) (Fig. 11C) and \(\tau\) (Fig S11; Table 1 – Spearman correlation with accuracy) was similar in the biased sessions. Jointly, these results indicate that although most participants showed signs of short-term learning when making category decisions, this learning was largely restricted to the immediately preceding trial.
Participants’ subjective estimate of distractor probability varies inversely with their learning of stimulus statistics when making category decisions. (A) On unbiased trials, nearly all participants (black, N = 55) exhibited short-term learning (\({W}_{1}\) > 0). (B) Their accuracy was inversely correlated with the time constant (\(\tau\)) for such learning. Further, the median number of previous trials over which participants retained information was 0.63, indicating that participants mostly retained information from the immediately preceding trial. (C) On biased trials (biased low: light blue, N = 53; biased high: red, N = 48), the effect of short- and long-term learning varied substantially across participants. Across all three sessions, \({p}_{distractor}\) was inversely correlated with (D, E) short-term learning given by \({W}_{1}{e}^{-1/\tau }\) and in the biased sessions, it was also inversely related to (F) long-term learning captured using \({W}_{constant}\). All panels, data point: one participant; †p < 0.05; *p < 0.01, **p < 0.001, ***p < 0.0001; Statistics: Spearman correlation; error bars: standard error of the mean.
Similar to short-term learning, we also expected some participants to exhibit effects of long-term learning. Indeed, most participants learned the long-term trial statistics (Fig. 11C), but this learning was not correlated with their short-term learning (Spearman; Table 1). Thus, participants appear to use both the bias in the session and their knowledge of previous trials, albeit to different degrees, when making decisions.
Next, we probed how these two learning mechanisms relate to participants’ estimated probability of distractors. We found that the weight of the preceding trial (\({W}_{1}{e}^{-1/\tau }\)) was inversely correlated with \({p}_{distractor}\) for all three sessions (Figs. 11D, E; Spearman, unbiased: ρ = -0.36, p = 0.0069, biased low and high: ρ = -0.47, p = 1e-6). Similarly, long-term bias (\({W}_{constant}\)) was also inversely correlated with \({p}_{distractor}\) (Fig. 11F; Spearman, biased low and high: ρ = -0.25, p = 0.013). This indicates that participants’ measure of relevance uncertainty was inversely correlated with their stimulus-category knowledge derived over both short and long timescales. This suggests that participants who are poor at differentiating between signal and distractor tones likely rely on other sources of information, such as past knowledge, when making category choices.
Finally, we asked how the various parameters that drive performance (sensory uncertainty, stimulus relevance, stimulus uncertainty, and learning) relate to task accuracy. We found that sensory uncertainty was a strong predictor of accuracy, accounting for much of the variance (Figs. 7C, S12; Spearman, biased low and high: ρ = -0.75, p = 2.38e-19). Thus, the main factor determining how proficiently people categorized sounds could be interpreted as how well they perceived the different tone frequencies. These results support the basic assumption of Bayesian models that sensory uncertainty is a fundamental factor in decision-making.
Discussion
It has long been known that humans and other animals can isolate signals of interest from the background in noisy, crowded environments. When presented with two competing auditory streams, participants’ neural responses to the task-relevant stimuli are enhanced53,54, but their processing of the irrelevant distractor stimuli is mediated by attention55,56. Visual spatial and feature attention affects the neural representation of task-irrelevant sensory information57,58. Furthermore, because such information is often incompletely suppressed, the presence of distractors impairs human decision-making behavior15. However, much of this work has studied the effects of relevance in isolation. When making decisions in noisy, uncertain environments, humans often incorporate multiple sources of information. One such source is the long-term sensory statistics of the stimuli of interest30,59,60,61 – studies have found that sensitivity to these statistics introduces observable biases in perception. However, in addition to tuning to long-term regularities, our sensory representations often need to rapidly adapt to short-term changes62. Indeed, such adaptation underlies context-dependent speech perception and categorization27. In our work, we investigated this complexity in perceptual decision-making by testing how participants’ sensory uncertainty, measure of stimulus relevance, stimulus uncertainty, and learning interacted to shape their auditory categorization behavior.
We found that participants differentially combined these factors when making category decisions. Specifically, when categorizing tone sequences consisting of both category-irrelevant (distractors) and relevant tones, participants weighed individual tones by their task relevance (Figs. 2,3; Figs S1-4,10). Additionally, nearly all participants had similar stimulus uncertainty but diverged in their sensory uncertainty and their estimates of stimulus relevance (Fig S5). In fact, participants’ psychometric curves and category choices were jointly determined by both their sensory uncertainty and their subjective estimate of stimulus relevance, whereas their accuracy was mostly driven by their sensory uncertainty (Fig. 7). We also observed that participants’ prior expectations of stimulus category were affected by their knowledge of stimulus statistics over both short and long timescales, and this effect was inversely correlated with participants’ estimated probability of distractors (Figs. 9,10,11; Fig S7-10).
Flexibility of experimental paradigm
Compared to previous work30,63, our experimental paradigm is unique in that we could modulate the multiple factors of sensory uncertainty, stimulus relevance, stimulus uncertainty and learning all within the same stimulus sequence. For example, because our stimulus was a three-tone sequence and because we could manipulate the number of task-irrelevant distractors (Fig. 1), we could examine not only how participants accounted for relevance during categorization but also how their sensory uncertainty might have affected their associated measures of relevance (Figs. 2,5,6,7; Fig S1). Additionally, although humans use both short- and long-term knowledge to efficiently make decisions26,33, most studies probing the influence of learned expectations on categorization have largely examined either short- or long-term effects without testing for their interaction64,65,66. Because in part of our experiment we biased the tone sequences toward either the low or the high category, we could investigate how participants’ reliance on long-term stimulus statistics interacted with their learning of short-term statistics, i.e., the effect of the previous trial’s category (Fig. 11). Thus, our experimental setup allowed us to simultaneously probe multiple factors that often underlie category decision-making. However, a potential shortcoming of our experimental design is that we did not vary stimulus uncertainty. Future studies could examine how this parameter interacts with all the other behavioral factors when making category decisions. In addition, we treat stimulus uncertainty as an estimate that the participants make about the stimulus distribution. However, it is possible that some participants do not explicitly separate the estimates of the underlying stimulus distributions and internal noise.
Bayesian approach to study variability and interplay of relevance and uncertainty
To quantify the interplay between all these factors, we leveraged a Bayesian model (Figs. 1,5,6,7,8,9,10,11). Bayesian models are powerful tools that have been successfully used to study human behavior in sensorimotor learning12,67, sensory perception25,68, and categorization30,42. Our Bayesian analysis allowed us to characterize how individual participants may vary in their categorization behaviors. Across all three sessions, performance of nearly all participants was best captured by the full Bayesian model (Figs. 6,11; Figs S4,10). However, in the unbiased session, a few participants were equally well fit by both the full Bayesian and the no-distractor models, suggesting that they did not take relevance into account when making category decisions. In the biased sessions, two participants who relied only on long-term stimulus statistics to make their decisions were best captured by the random-guess model (Fig S10B, E).
In addition to this heterogeneity across strategies, we found that even among the participants best captured by the full Bayesian model, there was substantial variability (Figs. 7,11; Figs S2-6,8) resulting from participants’ sensory uncertainty, their subjective measure of relevance, and their degree of short- and long-term learning. For instance, by modeling sensory uncertainty using the parameter \({\sigma }_{sensory}\), we found that although, on average, the value of participants’ sensory uncertainty is consistent with previous literature30, some participants have low sensory uncertainty and high task accuracy, whereas others have substantially higher sensory uncertainty and poorer task accuracy (Fig. 7). This variability also extended to participants’ measure of stimulus relevance and their learning of stimulus statistics; in the latter, we observed that not all individuals used their prior information. Overall, these results illustrate the diversity of the human perceptual decision-making process and are consistent with prior studies that have elaborated upon individual differences in performance across several speech and non-speech tasks69,70. Additionally, because our fitting procedures for the Bayesian models suggest that the participants are likely veridical about the stimulus features, their variability in categorization behavior mostly results from the cumulative effects of internal processes such as determination of stimulus relevance, sensory uncertainty and reliance on short- and long-term learning. This finding is consistent with previous work that attributes individual differences in the cocktail-party effect to stimulus-independent, internal stochastic processes71,72.
Future work could use similar Bayesian approaches to probe whether, in such auditory categorization tasks, participants use the same strategy throughout the experiment or switch between different strategies73 based on the complexity of the trial. Further, given the interdependence of stimulus relevance and strategy, our analysis is likely a simplification of human behavior, and relevance may be a time-varying property of participants. This contrasts with sensory uncertainty, which is likely a static parameter intrinsic to a given participant.
Learning
The Bayesian analysis also allowed us to explore how the different forms of uncertainty – sensory uncertainty, stimulus relevance and stimulus uncertainty – interact with learning of stimulus statistics during categorization. We found that both short-term and long-term learning biased participants’ behavior in a manner consistent with previous studies27,30. Moreover, these effects were independent of each other (Fig. 11)60, but both were inversely correlated with the relevance parameter \({p}_{distractor}\). In other words, when making category decisions, participants combined their uncertainty about stimulus relevance with other sources of information74. Additionally, because the influence of long-term learning in the experiment grew with time (Fig S9), participants may have adapted to the underlying stimulus statistics. Future work should investigate the neural mechanisms that drive the dynamics of this adaptation, over both short and long timescales, and how such adaptation may be altered by the presence of distractors.
Future directions
The interplay between multiple factors might also have implications for the broader understanding of how humans navigate real-life crowded noisy environments. For example, spatial cues75,76 and attention77,78,79 are integral to parsing multiple sound sources, but it is not yet well understood how humans can effortlessly identify relevant sounds in these noisy conditions. In our study, participants categorized three-tone sequences into high or low categories such that their category decisions were not greatly affected by tone positions (Fig S6). However, real-life stimuli such as speech80 and musical chords have rich information both within and across sequences. Whether such structure can aid relevance discrimination remains untested. Furthermore, our results indicate that participants’ estimates of stimulus relevance varied inversely with their learning of stimulus statistics over different time scales. Future work should explore the role of relevance in real-life scenarios and whether prior knowledge of one’s environment may detract from identification of relevant sounds. By covering many aspects of the complex process of auditory category learning, our approach may also open new ways of studying the neural basis of audition. Short- and long-term learning81,82,83,84,85, uncertainty in relevance of stimuli57, uncertainty in category of trials86 and sensory uncertainty may all be captured by distinct circuit elements and brain regions87,88. As such, experiments that vary these factors in a model-driven framework can enable new approaches towards understanding neural computation in the auditory system.
Methods
Ethics statement
The experimental protocol was approved by the Institutional Review Board of the University of Pennsylvania. All procedures were carried out in accordance with the IRB guidelines. All participants took part voluntarily and provided written informed consent before participating.
Human psychophysics apparatus
We used the crowdsourcing online platform Prolific to recruit participants. We collected participants’ consent and demographic data using Qualtrics and conducted all experiments using Pavlovia. All subjects reported having normal or corrected-to-normal hearing. Subjects were required to use a laptop or desktop computer to complete the experiment. Before continuing with the experiment, each participant had to pass a ‘headphone-check’ task that helped ensure that they were wearing headphones or earphones as instructed89. This headphone check required participants to judge which of three pure tones was the quietest and is designed to screen out non-headphone users by exploiting phase cancellation, which alters the tones’ relative loudness over loudspeakers.
Stimulus creation and task design
Human participants performed an auditory categorization task in which they reported the frequency category (\(\widehat{\text{High}}\) or \(\widehat{\text{Low}}\)) of a three-tone sequence (each tone is a sinusoid; tone duration: 300 ms during training, 280 ms during testing; inter-tone interval: 175 ms during training, 165 ms during testing; 44,100 Hz sampling rate; 20 ms on- and off-ramps). Longer durations and intervals were used during training to give the listener more information about the task; once participants had learned the task structure, these parameters were shortened to limit overall experiment duration. Each of the three tones \(v\) could probabilistically be either signal or distractor; that is, a signal tone was relevant to the category membership of a sequence, whereas a distractor tone was irrelevant. The signal frequency probability distributions were Gaussian (\({p}_{G}\); Eqs. 1a, b) in log(Hz) space: the mean of the low-frequency distribution (\({\mu }_{L}\)) was 2.55, the mean of the high-frequency distribution (\({\mu }_{H}\)) was 2.85, and the standard deviation of both (\({\sigma }_{expt}\)) was 0.1. In Hz, the mean frequencies for the low and high distributions correspond to 355 Hz and 708 Hz, respectively. The distractor frequency probability distribution was uniform (\({p}_{U}\); Eq. 1c) and overlapped with both Gaussian distributions. For the distractor distribution, frequencies ranged from a = 90 Hz to b = 3000 Hz.
Overall, a low (high) category sequence was a combination of low-frequency (high-frequency) signal tones and distractor tones. The complete frequency range of the tones was 90–3000 Hz and was uniformly sampled in log10 space such that the frequency of each tone was one of 30 possible values. Because of the discrete nature of the frequencies, the equations presented here are probabilities across the 30 possible frequency values rather than continuous probability density functions.
Participants completed three sessions of the experiment: unbiased, biased low, and biased high. Tone sequences in the unbiased sessions were equally likely to be drawn from either the high or low categories, i.e., \({p}_{L}=0.5\). Conversely, in the biased low (biased high) sessions, we overrepresented the corresponding distribution such that \({p}_{L}({p}_{H})=0.7\). Across all three sessions, the probability that each tone was signal was \({p}_{S}=0.7\) and of it being a distractor was \({p}_{D}=0.3\). This meant that, of the set of three tones, the probability that the set included exactly one distractor was 3 × 0.3 × (1 − 0.3) × (1 − 0.3) ≈ 0.44. The signal and distractor tone probabilities were chosen based on internally collected pilot data. Specifically, for \({p}_{D}\), we tested values of 0.2, 0.3 and 0.4 and chose \({p}_{D}=0.3\) based on a tradeoff between participants’ accuracy and their self-reported difficulty of the experimental task.
The generative process of the experimental stimuli can be summarized as,
where \({\varvec{v}}\) is the set of the three experimental tone frequencies \({v}_{1},{v}_{2},{v}_{3}\) (in units of log(Hz)). \({\varvec{r}}\) is the set of tone relevance indicators \({r}_{1},{r}_{2},{r}_{3}\in \{R, I\}\) such that \({r}_{i}=R\) implies that the tone \({v}_{i}\) is signal (relevant) and \({r}_{i}=I\) implies that it is a distractor (irrelevant). \(C\) is the category of the sequence (high (\(H\)) or low (\(L\))). Because the tones and their relevance are conditionally independent given a trial’s category, we can rewrite Eq. 2 as:
For example, in a high-category sequence, for a given tone \({v}_{i}\),
The priors in the generative process are \(p\left(C=L\right)={p}_{L}\) and \(p\left(C=H\right)={p}_{H}=1-{p}_{L}\).
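The generative process above can be sketched in code. The sketch below uses the stated experimental parameters (\({\mu }_{L}=2.55\), \({\mu }_{H}=2.85\), \({\sigma }_{expt}=0.1\), \({p}_{S}=0.7\), 30 frequencies log-spaced over 90–3000 Hz); snapping Gaussian draws to the nearest grid value is our assumption about how the discretization was implemented.

```python
import math
import random

# Experimental parameters from the Methods (unbiased session).
MU_L, MU_H, SIGMA = 2.55, 2.85, 0.1      # means and SD in log10(Hz)
P_LOW, P_SIGNAL = 0.5, 0.7               # category prior and p(signal)

# 30 possible tone frequencies, uniformly spaced in log10 over 90-3000 Hz.
GRID = [10 ** (math.log10(90) + i * (math.log10(3000) - math.log10(90)) / 29)
        for i in range(30)]

def sample_trial(rng: random.Random):
    """Sample (category, relevance indicators, tone frequencies in Hz)
    for one three-tone trial, following Eqs. 1-5."""
    category = 'L' if rng.random() < P_LOW else 'H'
    mu = MU_L if category == 'L' else MU_H
    relevance, tones = [], []
    for _ in range(3):
        if rng.random() < P_SIGNAL:
            # Signal tone: Gaussian in log10(Hz), snapped to the grid
            # (snapping is our assumption about the discretization).
            relevance.append('R')
            f = rng.gauss(mu, SIGMA)
            tones.append(min(GRID, key=lambda g: abs(math.log10(g) - f)))
        else:
            # Distractor tone: uniform over the 30 grid frequencies.
            relevance.append('I')
            tones.append(rng.choice(GRID))
    return category, relevance, tones
```

A biased session would simply change `P_LOW` to 0.7 (biased low) or 0.3 (biased high).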
Participants
In total, 48 healthy adults (22 female; mean ± SD age: 26.32 ± 7.17) successfully completed the full study. Of the 169 adults who signed up for the study, 32 declined to participate after signing the consent forms, 72 failed the ‘headphone check’, and 9 others either completed only one of the biased sessions or did not exceed 70% accuracy on the trials without distractors. We tested for this accuracy threshold during a participant’s first session, which ensured that participants understood the basic task of categorizing three-tone sequences into two categories. All participants self-reported normal hearing or the use of hearing aids.
We had 3 study variations to randomly counterbalance the order of the session type across participants: (1) unbiased, biased low, and biased high (N = 16); (2) biased high, biased low, and unbiased (N = 17); and (3) biased low, unbiased, and biased high (N = 15). An additional 8 adults (3 female; mean ± SD age: 32 ± 13.95) only partially completed the study; their data supplemented the unbiased experiment. Data from 5 of these participants also supplemented the biased low experiments.
We collected pilot data for 9 participants and fit the full Bayesian and no-distractor models to these data. We computed the effect size based on the negative log-likelihood of the model fits and, using a power value of 0.2, calculated the minimum required sample size for our experiment with the function pingouin.power_ttest (Python). The minimum required sample size was n = 16.04, which indicated that our experimental sample size of 56 participants in the unbiased session and 48 participants across both of the biased sessions was sufficient to capture the effect of distractors on participant behavior. A larger cohort also helped ensure that we could study variability in the underlying factors across real-world participants. Additionally, the cohort size is similar to that of other comparable studies30,41,90,91,92.
Experimental setup
During the training part of the task, participants were presented with 35 total trials. First, they were presented with 16 easy trials in which all three tones were signal and their frequencies were drawn from the mean value \(\pm\) 1 \(\sigma\) of either the low- or high-frequency Gaussian (signal) distributions. They were then presented with 10 more difficult trials in which two of the three tones were signal: two tones were drawn from the mean \(\pm\) 1 \(\sigma\) of either the low- or high-frequency Gaussians, whereas the third was a distractor drawn from the extremes of the uniform distractor distribution (\(<{\mu }_{L}-4\sigma\) or \(>{\mu }_{H}+4\sigma\)). Finally, participants performed 9 very difficult trials in which only one of the three tones was signal and the other two were distractors. Before the ‘more difficult’ and ‘very difficult’ training sets, we informed the participants that the trials in those sets would include one or two distractor tones, respectively. Thus, the progression of the ‘easy’, ‘more difficult’ and ‘very difficult’ trials allowed us to systematically introduce the distractor tones to the participants. Participants were instructed to ignore the distractor tones and to base their category decision only on the signal tones. We used the same training paradigm for all three sessions – unbiased, biased low, and biased high. However, we did not inform participants about the potentially biased nature of a session. Therefore, any information about the long-term bias was gained through exposure to the stimuli during the testing phase of the experiment.
In the testing phase, there were 600 trials in the unbiased session and 800 trials in each of the biased sessions. We randomized the trials and divided them into 4 blocks of 150 trials each for the unbiased session and 5 blocks of 160 trials each for the biased sessions. Participants could optionally take a break (maximum: 3 min) between blocks. Participants had to respond within 1.8 s after the offset of the last tone and were given visual feedback about their report after each trial. The feedback duration was 0.8 s, and participants were shown either ‘Correct’, ‘Not quite’, or ‘No response. Please respond faster’. Additionally, to ensure data quality, participants were not allowed to skip or incorrectly answer 10 trials in a row; we gave them an ‘attention warning’ if 5 consecutive trials were skipped or answered incorrectly. In the unbiased session, the median number of no-response trials was 1, and the 90th percentile was 3.8. The number of no-response trials was also low during the biased low (median: 1; 90th percentile: 7) and biased high (median: 1; 90th percentile: 11.4) sessions. We excluded the no-response trials when computing participant accuracy.
At the end of each experimental session, participants were asked for their consent to be recalled for the next session. The dates of data collection were staggered, and we conducted each subsequent session after a variable interval of 1–30 days. We could not identify any relationship between the number of days between sessions and a participant’s behavior.
Raw data analysis
Figure 2A-B: To compute the influence of each tone frequency on category choice probability \(\text{p}\left(\widehat{\text{High}}\right)\), we considered the tones in the region where \(\text{p}\left(\text{signal}\right)>\text{p}\left(\text{distractor}\right)\). For each of these tones, we calculated the average category choice when that tone was presented as a signal tone. Then, we computed the influence as the normalized correlation coefficient between these tone frequencies and the corresponding category choice probabilities. We repeated this analysis using only trials in which the tones were presented as distractor tones. We additionally tested whether the position of a tone within each trial (e.g., first, second or third) affected participants’ decisions and found that these correlations were independent of tone position (Fig S1A; Table 1, one-way repeated measures ANOVA (rmANOVA)). This analysis suggests that participants weighed all three tone positions equally within a trial and integrated information across the tones when making their category choice; therefore, our analyses could treat tones in all three positions equivalently.
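The influence measure can be illustrated with a minimal sketch: a Pearson correlation between tone frequencies and the corresponding average category-choice probabilities. The data in the usage example are hypothetical, and the exact normalization used in the paper may differ.

```python
import math

def influence(freqs, p_high):
    """Pearson correlation between tone frequencies (log Hz) and the
    average probability of choosing High when each tone was presented.
    Values near +1 indicate a strong positive influence of frequency
    on the High-category choice."""
    n = len(freqs)
    mx, my = sum(freqs) / n, sum(p_high) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(freqs, p_high))
    sx = math.sqrt(sum((x - mx) ** 2 for x in freqs))
    sy = math.sqrt(sum((y - my) ** 2 for y in p_high))
    return cov / (sx * sy)

# Hypothetical example: choice probability rises with tone frequency,
# as expected for signal tones in the high-signal region.
r_signal = influence([2.4, 2.6, 2.8, 3.0], [0.1, 0.4, 0.6, 0.9])
```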
Accuracy
We computed each participant’s accuracy by comparing their response to a given trial’s ‘true’ category, i.e., the category used to generate the trial via Eq. 2.
Computational models
Bayesian models
We assumed that participants performed Bayesian inference over a generative model when solving the categorization task: given their sensory evidence of the tones \({m}_{i}\), where \(i\in \{1,2,3\}\), participants computed the posterior probability,
where \(C\) is the category choice (\(\widehat{\text{High}}\) (\(\widehat{\text{H}}\)) or \(\widehat{\text{Low}}\) (\(\widehat{\text{L}}\))). In this equation, the likelihood \(p\left({\varvec{m}}|C\right)\) that the sequence of sensory evidence \({\varvec{m}}\) belonged to a particular category was calculated by marginalizing over a broad range of hypothesized tone frequencies \({\varvec{f}}=\{{f}_{1},{f}_{2},{f}_{3}\}\). To account for the entire frequency range of human hearing93,94, both \({\varvec{f}}\) and \({\varvec{m}}\) span \(\left[3.98, 39{,}810\right]\text{ Hz}\). In addition, we introduced a dummy variable \({\varvec{r}}=\left\{{r}_{1},{r}_{2},{r}_{3}\right\}\) that explicitly characterized whether a participant deemed each tone’s sensory evidence (\({m}_{i}\)) as ‘relevant’ (denoted by R) or ‘irrelevant’ (denoted by I) to their category decision. Thus, the likelihood of a high-category three-tone trial is,
Each hypothesized tone \({f}_{i}\) generated sensory evidence \({m}_{i}\) according to the probability density \(p\left({m}_{i}|{f}_{i}\right)\), a Gaussian characterized by the participant’s sensory uncertainty and noise in the auditory pathway (denoted by the parameter \({\sigma }_{sensory}\)). Thus,
The details of how we computed the value of \({\sigma }_{sensory}\) are given below in Methods—model fit and predictions.
Next, because the hypothesized tones and participant’s estimated relevance are conditionally independent given trial category, we can rewrite:
Additionally, because we observed that frequencies in each tone position are almost equally correlated with behavior (Fig S1A, S5), the relevance of a tone (denoted by parameter \({p}_{distractor}\)) is agnostic of its order in the three-tone sequence. Using the generative model (see Eqs. 4 and 5), we can then expand Eq. 10 as follows,
When computing the posterior in Eq. 6, \(p\left(C\right)={p}_{low}\) for category choice \(\widehat{\text{Low}}\) and \({p}_{high}\) for category choice \(\widehat{\text{High}}\). Here, \({\mu }_{low}, {\mu }_{high}, \sigma , {p}_{distractor}, \text{ and } {p}_{low}\) are the participants’ estimates of the true experimental parameters \({\mu }_{L}, {\mu }_{H}, {\sigma }_{expt}, {p}_{D}, \text{ and } {p}_{L}\) and are fitting parameters in the Bayesian models. When we fit these parameters to the participants’ data, the cost function we minimized was the negative log of the posterior, \(-\text{log}(\text{posterior})\).
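Because marginalizing each hypothesized frequency \({f}_{i}\) out of the Gaussian sensory likelihood (cf. Eq. 9) yields another Gaussian with variance \({\sigma }^{2}+{\sigma }_{sensory}^{2}\), the posterior in Eq. 6 admits a closed-form sketch. Below is a minimal version; treating the distractor term as uniform over the full hypothesized log-frequency range, and the parameter values themselves (e.g., \({\sigma }_{sensory}=0.19\), the population median), are our illustrative assumptions.

```python
import math

def gauss_pdf(x, mu, sd):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_high(m, mu_low=2.55, mu_high=2.85, sigma=0.1,
                   sigma_sensory=0.19, p_distractor=0.3, p_low=0.5):
    """p(High | sensory evidence m) for a three-tone trial, m in log10(Hz).

    Per tone, the likelihood mixes a signal term (Gaussian with variance
    sigma^2 + sigma_sensory^2, from marginalizing the hypothesized
    frequency) and a distractor term (uniform density over the assumed
    log-frequency range [log10(3.98), log10(39810)]).
    """
    sd = math.sqrt(sigma ** 2 + sigma_sensory ** 2)
    u = 1.0 / (math.log10(39810) - math.log10(3.98))  # uniform density

    def lik(mu):
        l = 1.0
        for mi in m:
            l *= p_distractor * u + (1 - p_distractor) * gauss_pdf(mi, mu, sd)
        return l

    num = (1 - p_low) * lik(mu_high)
    return num / (num + p_low * lik(mu_low))
```

Evidence at the category means drives the posterior toward the corresponding category, while evidence at the boundary (2.7 log Hz, with \({p}_{low}=0.5\)) leaves it at 0.5.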
We compared three main decision-making models in the analysis: (1) A ‘full Bayesian’ model, which assumed that participants probabilistically determined if a tone was ‘signal’ or ‘distractor’; (2) a ‘no-distractor’ model, which assumed that participants considered all of the tones to be ‘signal’; and (3) a ‘random-guess’ model, which assumed that participants considered all of the tones to be ‘distractors’ and thus responded randomly. In the ‘no-distractor’ model, \({p}_{distractor}\) = 0, while in the ‘random-guess’ model, \({p}_{distractor}\) = 1. In these three models, stimulus relevance is agnostic of tone position (Fig. 5). We also compared results from the full Bayesian model to those from an alternative model with position-dependent weights for each tone in the sequence (Fig S6).
The full Bayesian model has 6 fitting parameters – \({\mu }_{low}, {\mu }_{high}, \sigma ,{\sigma }_{sensory},{p}_{distractor},\) and \({p}_{low}\) – such that \(0<{p}_{distractor}<1\). On the other hand, in the no-distractor model, \({r}_{i}=R\) for \(i\in \{1,2,3\}\) in Eq. 11. Because this model assumes that all of the tones are ‘signal’, we can simplify the likelihood \(p\left({\varvec{f}}|H\right)\) into a product of three Gaussians. As a consequence, the no-distractor model has 5 fitting parameters – \({\mu }_{high},{\mu }_{low},\sigma ,{p}_{low}, \text{ and } {\sigma }_{sensory}\). In the random-guess model, participants categorized trials using random guesses proportional to the prior probability, and the 2 fitting parameters are \({p}_{low}\) and \({\sigma }_{sensory}\). More specifically, in this model the posterior in Eq. 6 simplifies to,
To capture change in expectations resulting from short- and long-term learning (Fig. 10), we fit an adapted full Bayesian model with a time-varying prior, such that,
Here, \(p{\left(C=L\right)}_{n}\) denotes a participant’s prior for the low-category choice decision in the \({n}^{th}\) trial. The indicator function \({\mathbb{l}}_{{t}_{C}}\) takes the value 1 if the ground truth of \({\left(n-t\right)}^{th}\) trial was low (L) and it takes the value -1 if it was high (H). \({W}_{constant}=\left|{W}_{0}-0.5\right|*2\) captures the long-term effect on participant’s expectations, \({W}_{1}\) indicates the short-term effect and \(\tau\) is the time constant governing the short-term effect.
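One form consistent with this description can be sketched as follows: a constant long-term component plus an exponentially decaying, \({W}_{1}\)-weighted sum over recent trial outcomes. The exact combination of \({W}_{0}\), \({W}_{1}\), and \(\tau\), and the clipping to a valid probability, are our assumptions; the published equation may differ in detail.

```python
import math

def prior_low(history, w0=0.55, w1=0.2, tau=1.0):
    """Sketch of a time-varying prior p(C = L)_n.

    history[-t] is +1 if trial n-t was Low and -1 if it was High
    (the indicator function in the text). w0 carries the long-term
    offset, and w1 scales an exponentially decaying sum over recent
    outcomes with time constant tau.
    """
    short = sum(math.exp(-t / tau) * history[-t]
                for t in range(1, len(history) + 1))
    p = w0 + w1 * short
    return min(max(p, 0.0), 1.0)   # clip to a valid probability
```

Under this form, a recent run of Low trials raises the Low prior on the next trial, while \(w0 \ne 0.5\) captures a session-long bias.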
Last, the model with position-dependent weights has 8 fitting parameters – \({\mu }_{low},{\mu }_{high},\sigma ,{\sigma }_{sensory},{p}_{distracto{r}_{positio{n}_{1}}}, {p}_{distracto{r}_{positio{n}_{2}}}, {p}_{distracto{r}_{positio{n}_{3}}}\) and \({p}_{low}\). \({p}_{distracto{r}_{positio{n}_{1}}}, {p}_{distracto{r}_{positio{n}_{2}}}\) and \({p}_{distracto{r}_{positio{n}_{3}}}\) are participants’ estimates of the probabilities that the first, second and third tone respectively in the sequence are distractors.
To fit each of the above models to the participants’ psychometric curves (see Figs. 6,8,9; Figs S4,5), we applied a decision threshold to the Bayesian model posterior such that,
Similar to Eq. 9, we considered that each tone frequency \({v}_{i}\) generated sensory evidence \({m}_{i}\) according to the probability density \(p\left({m}_{i}|{v}_{i}\right)\), which is a Gaussian characterized by the participant’s sensory uncertainty (denoted by the parameter \({\upsigma }_{sensory}\)).
Boundary model
We constructed a simple boundary-decision model for each participant with three decision criteria along the frequency continuum: (1) \({x}_{L}\), the boundary between low-distractor tones and low-signal tones; (2) \({x}_{C}\), the category boundary between low-signal tones and high-signal tones; and (3) \({x}_{H}\), the boundary between high-signal tones and high-distractor tones. Each tone was independently classified as a distractor (D), low signal (L) or high signal (H). The model for individual tones can be represented as,
Here, \({\widehat{c}}_{i}|{v}_{i}\) is the participant’s inferred generative category for tone i. The participant’s category report is based on the mode of the tone classifications over the three inferred tone categories \(\widehat{{\varvec{c}}}\) (a voting model),
Here, c is the participant’s inferred category for the full tone sequence. The category report probability is,
Here, y is the category report, which can be either 0 (“Low”) or 1 (“High”). If the mode of the tone classifications is “Distractor” or there is no single mode, the category report probability is 0.5.
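The classification-and-voting rule above can be sketched deterministically as follows; the boundary values in the usage example are illustrative, not fitted.

```python
from collections import Counter

def category_report_prob(tones, x_l, x_c, x_h):
    """p(report High) under the boundary/voting model.

    Each tone (in log10 Hz) is classified against the three criteria as
    distractor (D), low signal (L), or high signal (H); the report follows
    the modal class. If the mode is D, or there is no single mode, the
    model responds High with probability 0.5.
    """
    def classify(v):
        if v < x_l or v >= x_h:
            return 'D'
        return 'L' if v < x_c else 'H'

    counts = Counter(classify(v) for v in tones)
    top = counts.most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:   # tie: no single mode
        return 0.5
    return {'H': 1.0, 'L': 0.0, 'D': 0.5}[top[0][0]]

# Illustrative criteria: x_l = 2.3, x_c = 2.7, x_h = 3.2 (log10 Hz).
p = category_report_prob([2.8, 2.9, 2.75], 2.3, 2.7, 3.2)
```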
Generalized linear model
We constructed an elastic net (Lasso + second-order Tikhonov) regularized binary logistic regression model95 for each participant such that the model predictors are the indicator functions for the three tones in a trial. An indicator function \({\mathbb{l}}_{{t}_{i}}\) takes the value of 1 if tone \({t}_{i}\) is present in the trial and 0 otherwise. There are 30 predictors because a tone frequency can have 30 possible values. The model can be represented as,
Here, \(y\) is the participant’s category report. The weights \(w\) capture the influence of the different experimental tone frequencies on a participant’s decision-making process. The function \(\text{expit}(x)\) is computed as \(\frac{1}{1+\text{exp}\left(-x\right)}\).
To compute the individual influence of the three tone positions in the sequence (Fig S6H), we expand the model to have 90 predictors – 30 for each tone position.
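As a minimal sketch of the prediction step (fitting and regularization aside), assuming one fitted weight per possible tone frequency, a trial's report probability can be computed from its active indicators. Treating an indicator as "present" (rather than a count) for repeated frequencies within a trial is our reading of the indicator-function description.

```python
import math

def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_p_high(trial_tones, weights, bias=0.0):
    """p(report High) from the indicator-based logistic model.

    `weights` maps each of the 30 possible frequencies to its fitted
    weight; a trial activates the indicators of the frequencies it
    contains. Frequencies absent from `weights` contribute 0.
    """
    z = bias + sum(weights.get(f, 0.0) for f in set(trial_tones))
    return expit(z)

# Hypothetical fitted weights: low frequencies pull toward Low (negative),
# high frequencies toward High (positive).
w = {355.0: -1.5, 708.0: 1.5}
p = predict_p_high([708.0, 708.0, 200.0], w)
```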
Model Fit and Predictions
Bayesian models: full Bayesian and no-distractor models
We used two approaches to fit the full Bayesian and no-distractor models to the unbiased data. First, we fit all the parameters to the participant data. However, we suspected that we may be overfitting \({\mu }_{low}, {\mu }_{high}, \sigma\) and \({p}_{distractor}\), because the parameters are redundant, and there are multiple configurations that lead to the same likelihood function (Eq. 12) for the Bayesian model. Thus, in the second approach, we considered participants to be veridical about the parameters of the Gaussian distributions (\({\mu }_{low}=2.55,\) \({\mu }_{high}=2.85\) and \(\sigma =0.1\)) and only fit the remaining parameters – \({\sigma }_{sensory}, { p}_{low}\) and for the full Bayesian model, \({p}_{distractor}\). Because both approaches had similar fits (see Figs S5G, H) and to prevent overfitting, we followed the second approach.
In both approaches, we initially fit these models to a participant’s performance using a coarse parameter grid search, which allows for systematic multi-start optimization. When applicable, we used the following grid values: \({\mu }_{low}\in \left\{2.1, 2.25, 2.4, 2.55, 2.7\right\},{ \mu }_{high}\in \left\{2.7, 2.85, 3, 3.15, 3.3\right\},\) \(\sigma \in \left\{0.05, 0.25, 0.45, 0.65, 0.85\right\}\), \({p}_{distractor}\in \left\{0.05, 0.25, 0.45, 0.65, 0.85\right\}\), and \({p}_{low}\) was constrained by each participant’s correct performance. We also assumed that \({\sigma }_{sensory}\) is participant and task dependent but not model dependent. Moreover, because the essential experimental goal was the same in both the unbiased and the biased sessions, we assumed that \({\sigma }_{sensory}\) was constant across the three session types. Thus, instead of running a separate psychophysics experiment to compute \({\sigma }_{sensory}\), we used data from the unbiased session for trials in which all three tones were drawn from the signal distributions and there were no distractor tones. We fit these data using a smaller number of model parameters – \({\mu }_{low}, {\mu }_{high}, \sigma , {\sigma }_{sensory}, \text{ and } {p}_{low}\). When the remaining 4 parameters were systematically varied, we found that each participant’s \({\sigma }_{sensory}\) was essentially constant; across the population, the median was 0.19 (10th percentile: 0.14; 90th percentile: 0.29). These values are consistent with previous findings30.
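A generic version of this coarse grid search can be sketched as follows; the quadratic cost in the usage example is a toy stand-in for the Bayesian model's negative log-likelihood.

```python
import math
from itertools import product

def grid_search(nll_fn, grids):
    """Coarse multi-start grid search: evaluate the cost function on the
    Cartesian product of per-parameter grids and return the best point,
    which can then seed a finer optimizer (the paper uses Nelder-Mead
    via scipy.optimize.minimize)."""
    best, best_nll = None, math.inf
    for point in product(*grids):
        v = nll_fn(point)
        if v < best_nll:
            best, best_nll = point, v
    return best, best_nll

# Usage: toy cost with minimum at (0.45, 0.25), searched on the same
# 5-point grids as sigma and p_distractor above.
grids = [[0.05, 0.25, 0.45, 0.65, 0.85]] * 2
best, val = grid_search(lambda p: (p[0] - 0.45) ** 2 + (p[1] - 0.25) ** 2,
                        grids)
```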
Next, as for the unbiased session, we fit data from the two biased sessions using the second fitting approach. We tested our fitting procedure by simulating 4 virtual participants (Table 2), modeled on 4 actual participants from our dataset as a consistency check on performance accuracy. Additionally, for the virtual participants, we computed the confidence intervals on their fitted parameters by simulating behavior 10 times and following the bootstrapping procedure detailed below.
Specifically, we generated confidence intervals by creating either 100 bootstrapped (with replacement) datasets comprising 600 trials each for the unbiased session (Fig. 6, Fig S5) or 100 balanced datasets for the biased tasks (Fig. 10, Fig S10). A balanced dataset was constructed using all trials from the underrepresented category and subsampling an equal number of trials from the overrepresented category. This balanced dataset separated the external bias in the experiment (captured by \({p}_{L}=0.7\) or \({p}_{H}=0.7\)) from the internalized expectation of the participants, as the latter can vary from 0 to 1. For example, the participant in Fig. 9A has a negligible internalized expectation of bias, whereas the participants in Figs S10C, D have high internalized expectations. These data were then fit with a finer-resolution “post-grid search” optimization routine using the Nelder-Mead method from scipy.optimize.minimize (Python). During this optimization, we allowed \({\sigma }_{sensory}\) to vary within \(\pm\) 0.02.
When fitting the adapted full Bayesian model to participant performance, we reduced the number of fitting parameters from 5 \(({p}_{distractor}, {\sigma }_{sensory}, {W}_{0}, {W}_{1}\) and \(\tau\)) to 3 by using the median values of \({p}_{distractor}\) and \({\sigma }_{sensory}\). The parameter median values were calculated from the previous full Bayesian model fits to the bootstrapped (unbiased) or the balanced (biased low and biased high) datasets. To generate confidence intervals for \({W}_{0}, {W}_{1}\) and \(\tau\), for the unbiased data, we used 10 subsets of 500 contiguous trials and for the biased data, 10 subsets of 600 contiguous trials. Additionally, we fit these parameters using the following grid values: \({W}_{0}\in \left\{\text{0.35,0.38}\dots 0.66\right\}, {W}_{1}\in \{0, 0.1\dots 1\}\) and \(\tau \in {10}^{\{-1,-0.85 \dots 0.7\}}\).
Random-guess model
Because this model has only two free parameters, \({\sigma }_{sensory}{ \text{ and }p}_{low}\), we simplified the fitting procedure to directly fit either the 100 bootstrapped datasets (unbiased session) or the 100 balanced datasets (biased sessions). We fit \({p}_{low}\) with a grid search over {0, 0.1, 0.2, … ,1}.
Full Bayesian model with position-dependent weights
We reduced the 8 parameters in this model to 4: participants were assumed to be veridical about the Gaussian parameters, and for \({\sigma }_{sensory}\) we used the corresponding median value. Thus, we only fit \({p}_{distracto{r}_{positio{n}_{1}}}\), \({p}_{distracto{r}_{positio{n}_{2}}}\), \({p}_{distracto{r}_{positio{n}_{3}}}\) and \({p}_{low}\). The fitting procedure is similar to that of the full Bayesian model.
Metrics
Across the three experimental sessions, for nearly all participants, we used the final model parameter values of the full Bayesian model to compute their psychometric-curve fits and their Bayesian model posteriors (Figs. 5,7,8,9; Figs S3,4,10). Because each trial had 3 tones, we plotted the average psychometric curve across all 3 tone positions.
Average curve = \((p\left(High|ton{e}_{positio{n}_{1}}\right)+p\left(High|ton{e}_{positio{n}_{2}}\right)+p\left(High|ton{e}_{positio{n}_{3}}\right))/3\).
We tested the goodness of fit of the different models by comparing their Bayesian Information Criterion (BIC) values; the BIC penalizes model complexity and thus favors simpler models (Fig. 6; Fig S10). We also tested the results using the Akaike Information Criterion and the corrected Akaike Information Criterion96; both were consistent with the BIC analysis.
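For reference, the two criteria are simple functions of the number of parameters \(k\), the number of trials \(n\), and the maximized log-likelihood \(\ln \widehat{L}\); lower values indicate better fits.

```python
import math

def bic(n_params, n_trials, log_lik):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L-hat).
    The k*ln(n) term penalizes model complexity, and the penalty
    grows with the number of trials."""
    return n_params * math.log(n_trials) - 2.0 * log_lik

def aic(n_params, log_lik):
    """Akaike Information Criterion: 2k - 2*ln(L-hat)."""
    return 2.0 * n_params - 2.0 * log_lik

# Illustrative comparison: at equal log-likelihood, the 2-parameter
# random-guess model is preferred over the 6-parameter full Bayesian
# model; in practice the full model's higher likelihood must outweigh
# its complexity penalty.
delta = bic(6, 600, -300.0) - bic(2, 600, -300.0)
```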
From the full Bayesian model fits, we derived a ‘sigmoidicity’ metric for each participant. This metric captures the shape of the psychometric curve and thus, in turn, the category choices of the participant. It is ~ 0 for participants who can accurately identify distractor tones as irrelevant to their category decisions (example participant in Fig. 5B) and is ~ 1 for participants whose psychometric curve resembles a step function. Specifically, for each participant, sigmoidicity is computed from their respective posterior curve (examples in Figs S3B-D, S4B, E, H) as follows:
where a distractor sensory percept is more likely to be generated by the distractor tones than by the signal tones. Thus, in this formula, the distractor sensory percepts span the ranges \(\left[{\text{log}}_{10}(3.98),\ {\mu }_{low}-2.1\sigma -\text{median}({\sigma }_{sensory})\right]\) and \(\left[{\mu }_{high}+2.1\sigma +\text{median}({\sigma }_{sensory}),\ {\text{log}}_{10}(39{,}810.72)\right]\) in log(Hz), i.e., 3.98–39,810.72 Hz overall.
Generalized Linear model (GLM)
We fit a GLM to each participant’s psychophysical performance by choosing a weighting parameter \(\alpha\) using ten-fold cross-validation. \(\alpha\) controls a convex combination of Lasso and second-order Tikhonov regularization: \(\alpha = 0\) implies only Lasso regularization, while \(\alpha = 1\) implies only Tikhonov regularization. For example, \(\alpha\) for the curve fit in Fig S3E is 0.3, whereas it is 0 for those in Figs S3F and S3G. The Tikhonov matrix (\(\tau\)) is designed such that \({\tau }_{ii}=-1, {\tau }_{i,i-1}=2\) and \({\tau }_{i,i+1}= -1\).
The individual influence of the three tone positions in the sequence (Fig S5H) is computed by taking the sum of the absolute values of the corresponding GLM weights for each of the 30 tone frequencies. The weights for the first tone position are \({w}_{1}\dots {w}_{30}\), for the second tone position are \({w}_{31}\dots {w}_{60}\) and for the third tone position are \({w}_{61}\dots {w}_{90}\). We tested our fitting procedure by simulating 6 virtual participants (Table 3). The virtual participants were based on real experimental participants.
Boundary model
We fit a boundary model to each participant’s decision behavior by first setting the constraint \({x}_{L}<{x}_{c}<{x}_{H}\). Because the stimulus set consisted of 30 unique tones, there were only 31 meaningful boundary locations. Once the order constraint was included, 4494 possible combinations of the 3 criteria remained, and the minimum negative log-likelihood was found by optimizing the model over all possible combinations.
Data and Code Availability
The de-identified data and code associated with this paper are publicly available and can be accessed at https://github.com/geffenlab/UncertaintyRelevanceLearningInterplay. We have also used Zenodo to assign a DOI to the repository: https://doi.org/10.5281/zenodo.7439086.
References
Heekeren, H. R., Marrett, S. & Ungerleider, L. G. The neural systems that mediate human perceptual decision making. Nat. Rev. Neurosci. 9, 467–479. https://doi.org/10.1038/nrn2374 (2008).
Chen, S. Y., Ross, B. H. & Murphy, G. L. Decision making under uncertain categorization. Front. Psychol. https://doi.org/10.3389/fpsyg.2014.00991 (2014).
Niwa, M. & Ditterich, J. Perceptual decisions between multiple directions of visual motion. J. Neurosci. 28, 4435–4445. https://doi.org/10.1523/JNEUROSCI.5564-07.2008 (2008).
Bushdid, C., Magnasco, M. O., Vosshall, L. B. & Keller, A. Humans can discriminate more than 1 trillion olfactory stimuli. Science https://doi.org/10.1126/science.1249168 (2014).
Garcia, S. E., Jones, P. R., Rubin, G. S. & Nardini, M. Auditory localisation biases increase with sensory uncertainty. Sci. Rep. 7, 40567. https://doi.org/10.1038/srep40567 (2017).
Heron, J., Whitaker, D. & McGraw, P. V. Sensory uncertainty governs the extent of audio-visual interaction. Vis. Res. 44, 2875–2884. https://doi.org/10.1016/j.visres.2004.07.001 (2004).
Pouget, A., Beck, J. M., Ma, W. J. & Latham, P. E. Probabilistic brains: knowns and unknowns. Nat. Neurosci. 16, 1170–1178. https://doi.org/10.1038/nn.3495 (2013).
Qamar, A. T. et al. Trial-to-trial, uncertainty-based adjustment of decision boundaries in visual categorization. Proc. Natl. Acad. Sci. 110, 20332–20337. https://doi.org/10.1073/pnas.1219756110 (2013).
Beierholm, U., Rohe, T., Ferrari, A., Stegle, O. & Noppeney, U. Using the past to estimate sensory uncertainty. eLife 9, e54172. https://doi.org/10.7554/eLife.54172 (2020).
Barthelmé, S. & Mamassian, P. Evaluation of objective uncertainty in the visual system. PLOS Comput. Biol. 5, e1000504. https://doi.org/10.1371/journal.pcbi.1000504 (2009).
Zhou, Y., Acerbi, L. & Ma, W. J. The role of sensory uncertainty in simple contour integration. PLOS Comput. Biol. 16, e1006308. https://doi.org/10.1371/journal.pcbi.1006308 (2020).
Körding, K. P. & Wolpert, D. M. Bayesian integration in sensorimotor learning. Nature 427, 244–247. https://doi.org/10.1038/nature02169 (2004).
Wei, K. & Körding, K. Relevance of error: What drives motor adaptation?. J. Neurophysiol. 101, 655–664. https://doi.org/10.1152/jn.90545.2008 (2009).
Daliri, A. & Dittman, J. Successful auditory motor adaptation requires task-relevant auditory errors. J. Neurophysiol. 122, 552–562. https://doi.org/10.1152/jn.00662.2018 (2019).
Anders, U. M., McLean, C. S., Ouyang, B. & Ditterich, J. Perceptual Decisions in the Presence of Relevant and Irrelevant Sensory Evidence. Front. Neurosci. https://doi.org/10.3389/fnins.2017.00618 (2017).
Mirza, M. B., Adams, R. A., Friston, K. & Parr, T. Introducing a Bayesian model of selective attention based on active inference. Sci. Rep. 9, 13915. https://doi.org/10.1038/s41598-019-50138-8 (2019).
Lutfi, R. A., Kistler, D. J., Callahan, M. R. & Wightman, F. L. Psychometric functions for informational masking. J. Acoust. Soc. Am. 114, 3273–3282 (2003).
Oh, E. L., Wightman, F. & Lutfi, R. A. Children’s detection of pure-tone signals with random multitone maskers. J. Acoust. Soc Am. 109, 2888–2895 (2001).
Lutfi, R. A. Informational processing of complex sound. III: Interference. J. Acoust. Soc. Am. 91, 3391–3401. https://doi.org/10.1121/1.402829 (1992).
Hansen, K., Hillenbrand, S. & Ungerleider, L. Effects of prior knowledge on decisions made under perceptual vs. categorical uncertainty. Front. Neurosci. 6 (2012).
Kok, P., Brouwer, G. J., van Gerven, M. A. J. & de Lange, F. P. Prior expectations bias sensory representations in visual cortex. J. Neurosci. 33, 16275–16284. https://doi.org/10.1523/JNEUROSCI.0742-13.2013 (2013).
Rahnev, D., Lau, H. & de Lange, F. P. Prior expectation modulates the interaction between sensory and prefrontal regions in the human brain. J. Neurosci. 31, 10741–10748. https://doi.org/10.1523/JNEUROSCI.1478-11.2011 (2011).
Kok, P., Mostert, P. & de Lange, F. P. Prior expectations induce prestimulus sensory templates. Proc. Natl. Acad. Sci. 114, 10473–10478. https://doi.org/10.1073/pnas.1705652114 (2017).
Rohenkohl, G., Cravo, A. M., Wyart, V. & Nobre, A. C. Temporal expectation improves the quality of sensory information. J. Neurosci. 32, 8424–8428. https://doi.org/10.1523/JNEUROSCI.0804-12.2012 (2012).
Stocker, A. A. & Simoncelli, E. P. Noise characteristics and prior expectations in human visual speed perception. Nat. Neurosci. 9, 578–585. https://doi.org/10.1038/nn1669 (2006).
Mendonça, A. G. et al. The impact of learning on perceptual decisions and its implication for speed-accuracy tradeoffs. Nat. Commun. 11, 2757. https://doi.org/10.1038/s41467-020-16196-7 (2020).
Holt, L. L. Speech categorization in context: Joint effects of nonspeech and speech precursors. J. Acoust. Soc. Am. 119, 4016–4026. https://doi.org/10.1121/1.2195119 (2006).
Kluender, K. R., Coady, J. A. & Kiefte, M. Sensitivity to change in perception of speech. Speech Commun. 41, 59–69. https://doi.org/10.1016/S0167-6393(02)00093-6 (2003).
Holt, L. L., Lotto, A. J. & Kluender, K. R. Neighboring spectral content influences vowel identification. J. Acoust. Soc. Am. 108, 710–722. https://doi.org/10.1121/1.429604 (2000).
Gifford, A. M., Cohen, Y. E. & Stocker, A. A. Characterizing the impact of category uncertainty on human auditory categorization behavior. PLOS Comput. Biol. 10, e1003715. https://doi.org/10.1371/journal.pcbi.1003715 (2014).
Berniker, M., Voss, M. & Kording, K. Learning priors for bayesian computations in the nervous system. PLOS ONE 5, e12686. https://doi.org/10.1371/journal.pone.0012686 (2010).
Simoncelli, E. P. & Olshausen, B. A. Natural image statistics and neural representation. Annu. Rev. Neurosci. 24, 1193–1216. https://doi.org/10.1146/annurev.neuro.24.1.1193 (2001).
Behrens, T. E. J., Woolrich, M. W., Walton, M. E. & Rushworth, M. F. S. Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221. https://doi.org/10.1038/nn1954 (2007).
Green, D. M. & Swets, J. A. Signal detection theory and psychophysics (John Wiley, 1966).
Watson, C. S. & Kidd, G. R. Studies of tone sequence perception: Effects of uncertainty, familiarity, and selective attention. Front. Biosci. 12, 3355–3366. https://doi.org/10.2741/2318 (2007).
Russ, B. E., Lee, Y.-S. & Cohen, Y. E. Neural and behavioral correlates of auditory categorization. Hear Res. 229, 204–212. https://doi.org/10.1016/j.heares.2006.10.010 (2007).
Banno, T., Lestang, J.-H. & Cohen, Y. E. Computational and neurophysiological principles underlying auditory perceptual decisions. Curr. Opin. Physiol. 18, 20–24. https://doi.org/10.1016/j.cophys.2020.07.001 (2020).
Ley, A. et al. Learning of new sound categories shapes neural response patterns in human auditory cortex. J. Neurosci. 32, 13273–13280. https://doi.org/10.1523/JNEUROSCI.0584-12.2012 (2012).
Tsunada, J. & Cohen, Y. E. Neural mechanisms of auditory categorization: From across brain areas to within local microcircuits. Front. Neurosci. https://doi.org/10.3389/fnins.2014.00161 (2014).
Cao, Y., Summerfield, C., Park, H., Giordano, B. L. & Kayser, C. Causal inference in the multisensory brain. Neuron 102, 1076-1087.e8. https://doi.org/10.1016/j.neuron.2019.03.043 (2019).
Acerbi, L., Dokka, K., Angelaki, D. E. & Ma, W. J. Bayesian comparison of explicit and implicit causal inference strategies in multisensory heading perception. PLOS Comput. Biol. 14, e1006110. https://doi.org/10.1371/journal.pcbi.1006110 (2018).
Rigoli, F., Pezzulo, G., Dolan, R. & Friston, K. A goal-directed bayesian framework for categorization. Front. Psychol. https://doi.org/10.3389/fpsyg.2017.00408 (2017).
Adler, W. T. & Ma, W. J. Comparing bayesian and non-bayesian accounts of human confidence reports. PLOS Comput. Biol. 14, e1006572. https://doi.org/10.1371/journal.pcbi.1006572 (2018).
Knill, D. C. & Pouget, A. The Bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci. 27, 712–719. https://doi.org/10.1016/j.tins.2004.10.007 (2004).
Kording, K. P. Bayesian statistics: Relevant for the brain? Curr. Opin. Neurobiol. 25, 130–133. https://doi.org/10.1016/j.conb.2014.01.003 (2014).
Piasini, E., Balasubramanian, V. & Gold, J. I. Effect of geometric complexity on intuitive model selection. In Machine Learning, Optimization, and Data Science, Lecture Notes in Computer Science (eds Nicosia, G. et al.) (Springer International Publishing, 2022).
Weiss, Y., Simoncelli, E. P. & Adelson, E. H. Motion illusions as optimal percepts. Nat. Neurosci. 5, 598–604. https://doi.org/10.1038/nn0602-858 (2002).
Rangelov, D. & Mattingley, J. B. Evidence accumulation during perceptual decision-making is sensitive to the dynamics of attentional selection. Neuroimage 220, 117093. https://doi.org/10.1016/j.neuroimage.2020.117093 (2020).
Nett, N., Bröder, A. & Frings, C. When irrelevance matters: Stimulus-response binding in decision making under uncertainty. J. Exp. Psychol. Learn. Mem. Cogn. 41, 1831–1848. https://doi.org/10.1037/xlm0000109 (2015).
de Winkel, K. N., Katliar, M. & Bülthoff, H. H. Forced fusion in multisensory heading estimation. PLOS ONE 10, e0127104. https://doi.org/10.1371/journal.pone.0127104 (2015).
Wozny, D. R., Beierholm, U. R. & Shams, L. Probability matching as a computational strategy used in perception. PLOS Comput. Biol. 6, e1000871. https://doi.org/10.1371/journal.pcbi.1000871 (2010).
Rahnev, D. & Denison, R. N. Suboptimality in perceptual decision making. Behav. Brain Sci. 41, e223. https://doi.org/10.1017/S0140525X18000936 (2018).
Ding, N. & Simon, J. Z. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc. Natl. Acad. Sci. 109, 11854–11859. https://doi.org/10.1073/pnas.1205381109 (2012).
Mesgarani, N. & Chang, E. F. Selective cortical representation of attended speaker in multi-talker speech perception. Nature 485, 233–236. https://doi.org/10.1038/nature11020 (2012).
Schwartz, Z. P. & David, S. V. Focal suppression of distractor sounds by selective attention in auditory cortex. Cereb. Cortex 28, 323–339. https://doi.org/10.1093/cercor/bhx288 (2018).
Jensen, A., Merz, S., Spence, C. & Frings, C. Interference of irrelevant information in multisensory selection depends on attentional set. Atten. Percept. Psychophys. 82, 1176–1195. https://doi.org/10.3758/s13414-019-01848-8 (2020).
Chen, J., Scotti, P. S., Dowd, E. W. & Golomb, J. D. Neural representations of task-relevant and task-irrelevant features of attended objects. Prepr. bioRxiv https://doi.org/10.1101/2021.05.21.445168 (2021).
Xu, Y. The neural fate of task-irrelevant features in object-based processing. J. Neurosci. 30, 14020–14028. https://doi.org/10.1523/JNEUROSCI.3011-10.2010 (2010).
Hansen, K., Hillenbrand, S. & Ungerleider, L. Persistency of priors-induced bias in decision behavior and the fMRI signal. Front. Neurosci. https://doi.org/10.3389/fnins.2011.00029 (2011).
Roark, C. L. & Holt, L. L. Long-term priors constrain category learning in the context of short-term statistical regularities. Psychon. Bull. Rev. https://doi.org/10.3758/s13423-022-02114-z (2022).
Hansen, K. A., Hillenbrand, S. F. & Ungerleider, L. G. Human brain activity predicts individual differences in prior knowledge use during decisions. J. Cogn. Neurosci. 24, 1462–1475. https://doi.org/10.1162/jocn_a_00224 (2012).
Gekas, N., McDermott, K. C. & Mamassian, P. Disambiguating serial effects of multiple timescales. J. Vis. 19, 24. https://doi.org/10.1167/19.6.24 (2019).
Tardiff, N., Suriya-Arunroj, L., Cohen, Y. E. & Gold, J. I. Rule-based and stimulus-based cues bias auditory decisions via different computational and physiological mechanisms. PLOS Comput. Biol. 18, e1010601. https://doi.org/10.1371/journal.pcbi.1010601 (2022).
Ming, V. L. & Holt, L. L. Efficient coding in human auditory perception. J. Acoust. Soc. Am. 126, 1312–1320. https://doi.org/10.1121/1.3158939 (2009).
Smith, E. C. & Lewicki, M. S. Efficient auditory coding. Nature 439, 978–982. https://doi.org/10.1038/nature04485 (2006).
Stilp, C. E. & Kluender, K. R. Stimulus statistics change sounds from near-indiscriminable to hyperdiscriminable. PLOS ONE 11, e0161001. https://doi.org/10.1371/journal.pone.0161001 (2016).
Heald, J. B., Lengyel, M. & Wolpert, D. M. Contextual inference underlies the learning of sensorimotor repertoires. Nature 600, 489–493. https://doi.org/10.1038/s41586-021-04129-3 (2021).
Knill, D. C. & Richards, W. Perception as Bayesian Inference (Cambridge University Press, 1996).
Kidd, G. R., Watson, C. S. & Gygi, B. Individual differences in auditory abilities. J. Acoust. Soc. Am. 122, 418–435. https://doi.org/10.1121/1.2743154 (2007).
Oberfeld, D. & Klöckner-Nowotny, F. Individual differences in selective attention predict speech identification at a cocktail party. Elife 5, e16747. https://doi.org/10.7554/eLife.16747 (2016).
Lutfi, R. A., Pastore, T., Rodriguez, B., Yost, W. A. & Lee, J. Molecular analysis of individual differences in talker search at the cocktail-party. J. Acoust. Soc. Am. 152, 1804. https://doi.org/10.1121/10.0014116 (2022).
Lutfi, R. A., Rodriguez, B., Lee, J. & Pastore, T. A test of model classes accounting for individual differences in the cocktail-party effect. J. Acoust. Soc. Am. 148, 4014. https://doi.org/10.1121/10.0002961 (2020).
Ashwood, Z. C. et al. Mice alternate between discrete strategies during perceptual decision-making. Nat. Neurosci. https://doi.org/10.1038/s41593-021-01007-z (2022).
Honig, M., Ma, W. J. & Fougnie, D. Humans incorporate trial-to-trial working memory uncertainty into rewarded decisions. Proc. Natl. Acad. Sci. 117, 8391–8397. https://doi.org/10.1073/pnas.1918143117 (2020).
Ihlefeld, A. & Shinn-Cunningham, B. Spatial release from energetic and informational masking in a selective speech identification task. J. Acoust. Soc. Am. 123, 4369–4379. https://doi.org/10.1121/1.2904826 (2008).
Middlebrooks, J. C. & Waters, M. F. Spatial mechanisms for segregation of competing sounds, and a breakdown in spatial hearing. Front. Neurosci. https://doi.org/10.3389/fnins.2020.571095 (2020).
Bronkhorst, A. W. The cocktail-party problem revisited: Early processing and selection of multi-talker speech. Atten. Percept. Psychophys. 77, 1465–1487. https://doi.org/10.3758/s13414-015-0882-9 (2015).
Shinn-Cunningham, B. G. Object-based auditory and visual attention. Trends Cogn. Sci. 12, 182–186. https://doi.org/10.1016/j.tics.2008.02.003 (2008).
Wöstmann, M., Herrmann, B., Maess, B. & Obleser, J. Spatiotemporal dynamics of auditory attention synchronize with speech. Proc. Natl. Acad. Sci. 113, 3873–3878. https://doi.org/10.1073/pnas.1523357113 (2016).
Holt, L. L. & Lotto, A. J. Speech perception as categorization. Atten. Percept. Psychophys. 72, 1218–1227. https://doi.org/10.3758/APP.72.5.1218 (2010).
Freedman, D. J., Riesenhuber, M., Poggio, T. & Miller, E. K. A comparison of primate prefrontal and inferior temporal cortices during visual categorization. J. Neurosci. 23, 5235–5246. https://doi.org/10.1523/JNEUROSCI.23-12-05235.2003 (2003).
Zhong, L., Zhang, Y., Duan, C. A., Pan, J. & Xu, N. Dynamic and causal contribution of parietal circuits to perceptual decisions during category learning. Prepr. bioRxiv https://doi.org/10.1101/304071 (2018).
Zhong, L., Zhang, Y., Duan, C. A., Deng, J., Pan, J. & Xu, N. L. Causal contributions of parietal cortex to perceptual decision making during stimulus categorization. Nat. Neurosci. 22, 963–973. https://doi.org/10.1038/s41593-019-0383-6 (2019).
Akrami, A., Kopec, C. D., Diamond, M. E. & Brody, C. D. Posterior parietal cortex represents sensory history and mediates its effects on behaviour. Nature 554, 368–372. https://doi.org/10.1038/nature25510 (2018).
Bianco, R. et al. Long-term implicit memory for sequential auditory patterns in humans. eLife 9, e56073. https://doi.org/10.7554/eLife.56073 (2020).
Grinband, J., Hirsch, J. & Ferrera, V. P. A neural representation of categorization uncertainty in the human brain. Neuron 49, 757–763. https://doi.org/10.1016/j.neuron.2006.01.032 (2006).
van Bergen, R. S., Ji Ma, W., Pratte, M. S. & Jehee, J. F. M. Sensory uncertainty decoded from visual cortex predicts behavior. Nat. Neurosci. 18, 1728–1730. https://doi.org/10.1038/nn.4150 (2015).
Fetsch, C. R., Pouget, A., DeAngelis, G. C. & Angelaki, D. E. Neural correlates of reliability-based cue weighting during multisensory integration. Nat. Neurosci. 15, 146–154. https://doi.org/10.1038/nn.2983 (2012).
Woods, K. J. P., Siegel, M. H., Traer, J. & McDermott, J. H. Headphone screening to facilitate web-based auditory experiments. Atten. Percept. Psychophys. 79, 2064–2072. https://doi.org/10.3758/s13414-017-1361-2 (2017).
Bankieris, K. R., Bejjanki, V. R. & Aslin, R. N. Sensory cue-combination in the context of newly learned categories. Sci. Rep. 7, 10890. https://doi.org/10.1038/s41598-017-11341-7 (2017).
Bejjanki, V. R., Clayards, M., Knill, D. C. & Aslin, R. N. Cue integration in categorical tasks: insights from audio-visual speech perception. PLOS ONE 6, e19812. https://doi.org/10.1371/journal.pone.0019812 (2011).
McPherson, M. J. & McDermott, J. H. Diversity in pitch perception revealed by task dependence. Nat. Hum. Behav. 2, 52–66. https://doi.org/10.1038/s41562-017-0261-8 (2018).
Moller, H. & Pedersen, C. S. Hearing at low and infrasonic frequencies. Noise Health 6, 37 (2004).
Ashihara, K. Hearing thresholds for pure tones above 16 kHz. J. Acoust. Soc. Am. 122, EL52–EL57. https://doi.org/10.1121/1.2761883 (2007).
Jas, M. et al. Pyglmnet: Python implementation of elastic-net regularized generalized linear models. J. Open Sour. Softw. 5, 1959. https://doi.org/10.21105/joss.01959 (2020).
Cavanaugh, J. E. Unifying the derivations for the Akaike and corrected Akaike information criteria. Stat. Prob. Lett. 33, 201–208. https://doi.org/10.1016/S0167-7152(96)00128-9 (1997).
Acknowledgements
We thank the members of the Geffen Lab for advice and feedback on the project. We also thank Doris Dijksterhuis and Sandra Reinert for their ideas regarding data analysis. This work was supported by the NIH grant (R01NS113241) to MNG, YC, and KK. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
J.S., J.S.C., K.P.K., E.P., Y.E.C. and M.N.G. designed the study. J.S. collected the data. J.S. and J.S.C. analyzed the data. J.S., J.S.C., K.P.K., Y.E.C. and M.N.G. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Sheth, J., Collina, J.S., Piasini, E. et al. The interplay of uncertainty, relevance and learning influences auditory categorization. Sci Rep 15, 3348 (2025). https://doi.org/10.1038/s41598-025-86856-5