Introduction

For many neurological and psychiatric disorders, individuals struggle to distinguish important from irrelevant cues in the environment1,2,3,4,5. What drives this pathology in processing of stimulus salience is not well understood. In neurotypical brains, a substantial portion of variability in neural responses to auditory6,7,8,9, visual10,11, olfactory12,13, and somatosensory14,15 stimuli is explained by changes in neuromodulatory brain state, or arousal. Thus, one hypothesis for altered saliency processing in brain disorders is that changes in brain state dynamics manifest as sub-optimal cortical16,17, followed by peripheral18,19, and behavioral responses20,21 to incoming stimuli.

There is substantial evidence supporting the idea that neuromodulators released from the reticular activating system (RAS) are intimately involved in saliency processing. The RAS is a network of neurons, predominantly in the brainstem, that influences cognitive functions through its impact on global arousal states. The RAS exhibits two broad activation modes: long-lasting ‘tonic’ and seconds-long ‘phasic’. For example, the locus coeruleus (LC), one of the main RAS nodes, controls arousal in response to salient stimuli, and exhibits tonic low frequency (1–6 Hz) firing, as well as phasic, coordinated bursts at higher frequencies (8–15 Hz)22,23,24. Tonic LC and RAS activity is associated with changes in behavioral state, such that higher tonic firing rates occur with cognitive distractibility during times of increased stress or exploratory behavior, and lower tonic firing occurs during quiet wakefulness and waning task engagement23,25. The influential adaptive gain theory of LC function suggests that bursts of phasic activity exhibit an inverted-U shaped dependence on tonic activity, such that intermediate tonic arousal corresponds to the largest phasic responses and optimal downstream behavioral performance23, whereas higher or lower levels of tonic arousal correspond to sub-optimal responses in a broad range of tasks9,10,23,26,27,28. Phasic LC activity is driven by internally or externally experienced salient events, such as rewarding or threatening stimuli23,29,30. In response to salient stimuli, tonic state potentiates phasic activations, typically along the inverted-U shaped optimization curve9,23,27. Conversely, incoming phasic responses to salient stimuli in turn modulate subsequent tonic LC activity23,31.

LC and other RAS activity is very difficult to measure directly in humans; thus, the effects of neuromodulatory brain state on saliency responses and their state dependence are much less clear for the human brain. Pupil dilation has been used as a proxy for RAS activation, facilitating investigation. In both animals and humans, pupil dilation is correlated with RAS activation, and the pupil exhibits relationships to behavior that mirror results with RAS recordings in animals32,33,34,35. Fluctuations in pupil size in animals strongly correlate with changes in cortical dynamics. Tonic pupil size exhibits a strong inverse relationship to low-frequency (2–10 Hz) oscillatory activity, which is a hallmark of low arousal9,27,36,37. Larger pupil size is accompanied by a reduction in these low-frequency oscillations, and intermediate pupil size is associated with robust sound responses and associated phasic pupil dilations and cortical membrane potential depolarizations that, in line with the adaptive gain theory, can improve decisions by reducing choice biases9,30,37. Both the tonic and phasic modes of LC activity are linked with pupil diameter and the associated patterns of cortical activity27,34,38. Specifically, pupil diameter is highly correlated with LC tonic firing rate, where spontaneous increases in LC activity correspond to a larger pupil diameter, and vice versa32,33,34,35.

In humans, the cortical neural correlates of pupil-indexed brain state are largely undetermined due to the paucity of direct recordings. In the absence of membrane potential recording, which has yet to be achieved in humans, indirect measures from local field potential could serve as proxies. The aperiodic (1/f) slope has recently gained prominence as such a measure. The aperiodic 1/f slope is a summary metric of the commonly observed power law relationship between signal content and frequency, where the power in the signal decreases with increasing frequency, following a power law (1/f^a, where ‘a’ is a constant). When plotted logarithmically, this manifests as an approximately linear relationship between signal power and frequency. This ‘1/f effect’ is extensively observed in neural dynamics, where it encapsulates the non-oscillatory background activity, indirectly capturing the relative strength of low and high frequency oscillatory activity in cortical neurons. It has also been proposed to capture aspects of the balance between excitation and inhibition in cortical networks39, which is highly state dependent40. As such, aperiodic 1/f slope has been proposed as an index for physiological arousal, for example distinguishing between anesthesia, sleep, and wakeful states41, where a flatter slope corresponds to greater relative excitation to inhibition balance, which would align with states of increased arousal, and vice versa. Through this lens, 1/f slope may track tonic arousal state, thus explaining variability in pupillary and oscillatory cortical phasic responses to salient stimuli.
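The log-log linearity described above follows directly from the power law. As a brief worked step, writing the constant of proportionality as c (a symbol introduced here only for illustration):

```latex
P(f) = \frac{c}{f^{a}}
\quad\Longrightarrow\quad
\log P(f) = \log c - a \,\log f
```

On log-log axes the spectrum is therefore a straight line with slope -a, so a flatter (less negative) slope corresponds to a smaller exponent a.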

In this study, we aim to determine the relationships between 1/f slope, pupil-indexed brain state, and saliency responses in the human brain. To do so, we measured pupil size and direct intracranial recording in auditory cortical areas in awake human subjects during an oddball stimulus paradigm, in which rare ‘oddball’ stimuli should be processed as salient. We find that tonic pupil size is correlated to baseline 1/f slope and phasic pupil size is correlated to the magnitude of high gamma-band response to saliency. Furthermore, phasic high gamma and pupil responses exhibit inverted-U dependencies on 1/f slope measures and salient stimuli result in longer lasting changes in 1/f slope. Taken together, the results provide supporting evidence that 1/f slope is a useful index of cortical state and that saliency responses in human auditory cortex follow the adaptive gain theory.

Results

To investigate the roles of tonic and phasic neuromodulatory brain state changes in saliency processing by the human brain, we utilized an auditory oddball paradigm (Fig. 1a) and recorded in auditory cortical regions of eight patients undergoing treatment for refractory epilepsy (Fig. 1b; Supplemental Table 1).

Fig. 1

Auditory Oddball Paradigm and Electrode Placements. (a) The auditory oddball paradigm consisted of a sequence of pure sine wave tones of a given frequency (standard condition) randomly interleaved with a tone of a different frequency 10–20% of the time (oddball condition). Tones were placed 1.00–1.25 s apart, and oddball trials were pseudo-randomly varied between 6–9 s apart. Tones (1 kHz, 2 kHz) were counterbalanced across test blocks. LFP activity in pSTG and pupil diameter size were collected during task playback. (b) A total of 73 electrodes were identified in the pSTG regions of the auditory cortex and association areas.

Auditory salience responses show distinct peri- and post-stimulus spectral components

We first aimed to determine the spectrotemporal characteristics of sound responses and their relationship to pupil size. To do so, for each block of trials, we averaged wavelet spectrograms separately for standard (Fig. 2a) and oddball (Fig. 2b) trials within each electrode and then computed the average difference (oddball-minus-standard) in response (Fig. 2c; see Methods). Two distinct response components were evident: an early increase in high-frequency power during stimulus playback (0–250 ms), and a late suppression of low-frequency activity after stimulus offset (250–750 ms). We calculated the average percent power change from baseline for each electrode, per test block (N = 18) for each major frequency band (high gamma [70–150 Hz]; low gamma [32–70 Hz]; high beta [20–30 Hz]; low beta [12–20 Hz]; alpha [8–10 Hz]; theta [4–8 Hz]; delta [2–4 Hz]), in the early and late time windows, and quantified the magnitude and significance of these response components (Supplemental Fig. 1).
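As a rough illustration of the percent-change measure described above, the sketch below computes the mean power in a response window relative to a pre-stimulus baseline for one band-averaged power trace. The function name, the 250 ms baseline window, and the toy data are assumptions made for this example; the actual pipeline (wavelet spectrograms, per-electrode averaging) is described in the Methods.

```python
from statistics import fmean

def percent_change_from_baseline(power, times, baseline, window):
    """Mean power in `window` as a percent change from mean power in `baseline`.

    power: band-averaged power samples; times: matching sample times in seconds.
    baseline / window: (start, end) in seconds, start inclusive, end exclusive.
    """
    def mean_in(span):
        vals = [p for p, t in zip(power, times) if span[0] <= t < span[1]]
        if not vals:
            raise ValueError(f"no samples in window {span}")
        return fmean(vals)

    base = mean_in(baseline)
    return 100.0 * (mean_in(window) - base) / base

# Toy band-power trace sampled every 50 ms from -0.25 s to 0.70 s (not study data).
times = [round(-0.25 + 0.05 * i, 2) for i in range(20)]
power = [2.0 if t < 0 else 3.0 if t < 0.25 else 1.5 for t in times]

# Early (0-250 ms) and late (250-750 ms) windows, as in the text;
# the (-0.25, 0.0) baseline is an assumed choice for this sketch.
early = percent_change_from_baseline(power, times, (-0.25, 0.0), (0.0, 0.25))
late = percent_change_from_baseline(power, times, (-0.25, 0.0), (0.25, 0.75))
```

With this toy trace the early window shows an increase and the late window a suppression relative to baseline, mirroring the two response components qualitatively.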

Fig. 2

Auditory Cortical Salience Response Dynamics. Average spectral responses across all subjects for (a) standard, (b) oddball, and (c) salience detection (oddball – standard difference) conditions. (d) Between-subjects average % power change in each frequency band in the early (high gamma, low gamma) and late (high beta, low beta, alpha, theta, delta) time windows. (e) The salience detection component (oddball – standard difference) shows early increase and later suppression are larger for oddball compared to standard conditions. (f) Average pupil responses in oddball and standard conditions show (g) a significantly larger average response for oddball compared to standards, and a within-subjects salience detection component that is significantly different from zero. (h) Group-level correlations between pupil response and cortical response in the high gamma, low beta, and alpha frequency bands for the standard and oddball response conditions across all test blocks. *p < 0.05, **p < 0.01, ***p < 0.001.

For the early time window, there was a significant increase in power after the onset of the stimulus in both the standard and oddball conditions for each canonical frequency band, except for in the low beta frequency band for the oddball responses (Supplemental Table 2). In the late time window, power increased after stimulus onset for all frequency bands in the standard condition. The oddball condition was differentially affected; high gamma, low gamma, and delta showed an increase in power, while high beta, low beta, alpha, and theta did not show a statistically significant change (Supplemental Table 3). We saw statistically significant differences in the high frequency and low frequency in the early and late response windows, respectively (Fig. 2d). The early increase in high-frequency activity and the late post-stimulus suppression in low-frequency activity were significantly greater for oddball compared to standard trials in every canonical frequency band (Fig. 2e).

We found significant pupillary dilations to both oddball and standard tones (Standard: M(SD) = 0.54(5.92), t(df) = 3.27(1272), p = 0.001; Oddball: M(SD) = 2.83(5.62), t(df) = 8.56(288), p = 6.91e-16). The pupil dilation on oddball trials was larger than on standard trials (M(SD) = 2.63(1.48), t(df) = 5.03(7), p = 0.0015) (Fig. 2f and g), indicating a saliency response in the pupil.

Auditory cortical responses show time- and frequency-specific relationships to pupil responses

Cortical physiologic (Supplemental Fig. 2a) and pupillary responses (Supplemental Fig. 2b) exhibited substantial between-subject variability. The average high gamma maximum amplitude ranged between 57.91 and 199.27% change from baseline and peaked between 0.06 and 0.15 s post-stimulus, and the average oddball pupil response amplitude ranged between 2.04 and 11.88% change of maximum pupil size and peaked between 0.88 and 1.64 s post-stimulus onset. We therefore examined the extent to which phasic pupillary responses tracked the between-subject differences in response components within the auditory cortex.

Consistent with prior observations that pupillary responses correlate with high gamma activity in the salience network42, and beta and alpha43,44,45 activity over extracranial temporoparietal regions, group-level correlational analyses (Fig. 2h) showed a significant positive relationship between pupil responses and high gamma responses (r = 0.57; p-adj = 0.046), and a significant negative relationship between pupil responses and low beta (r = -0.79; p-adj = 0.0002) and alpha (r = -0.56; p-adj = 0.046) responses in the oddball condition. No clear relationship was identified between pupil responses and cortical responses for the standard condition (r = -0.04 to 0.30; p-adj = 0.37–0.87, Fig. 2h). Thus, a substantial portion of the between-subject differences in oddball responses in auditory cortex is accounted for by pupil-indexed brain state.

We next sought to examine the variability in responses at the single trial level. Similarly to our group-level analysis, a linear mixed-effects model (LME) revealed a significant condition-specific relationship between high gamma and pupil responses, within subjects. The initial high gamma LME was constructed with response size and condition as interacting fixed effects (see model notations and summaries in Table 1). This model revealed a significant interaction effect (p-adj.=0.007) between cortical response amplitude and condition (i.e. standard or oddball) in predicting pupil responses, with non-significant main effects of cortical response amplitude (p-adj.=0.486) and trial condition (p-adj.=0.263). The lack of a main effect of condition was unexpected given the robust salience response seen between standard and oddball responses in high gamma and pupil responses (Fig. 2a–b). We therefore fit separate LMEs for standard and oddball trials (Table 1) to disentangle the significant interaction term found in the full model. We found that high gamma response amplitudes predicted pupil response amplitudes for oddball (p = 0.0013), but not standard (p = 0.390) trials, such that larger high gamma response amplitudes were accompanied by larger pupil response amplitudes on oddball trials. Model results were significant following Benjamini-Hochberg corrections for three high gamma models (Table 1). We found no significant relationship between low beta-frequency or alpha-frequency neural responses and pupil dilations at the trial level (Table 1).
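For readers unfamiliar with the Benjamini-Hochberg procedure used to produce the adjusted p-values reported here, the following is a minimal standard-library sketch of the step-up adjustment. The function name and the example p-values are illustrative assumptions and are not taken from the study or its analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, smallest p first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Illustrative p-values only; these are not values from the study.
example = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(example)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a larger one.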

Table 1 Pupillary readouts of auditory cortical saliency responses.

Auditory cortical response differences are not dependent on tone frequency or depression status

The auditory cortex is tonotopically organized46, and different stimuli may exhibit differing salience effects. Therefore, we examined whether cortical responses depended on the carrier frequency of the individual sounds used for standard and oddball stimuli (1000–2000 Hz, counterbalanced condition assignment, see Methods). We found no significant differences in cortical or pupillary response when trials were grouped by tone frequency instead of by condition (Supplemental Fig. 3, Supplemental Table 4).

Given that high rates of comorbid depression exist in our study cohort’s clinical population47, and the known disruptions in salience processing in regions of the salience network1,3,4,5 for depressed compared to non-depressed patient populations, we sought to compare response profiles between our depressed and non-depressed subjects. We found that depressed and non-depressed subjects exhibited a similar pattern of activation – an early increase in high-frequency activity and a later suppression in low-frequency activity which was larger for oddball compared to standard trials (Supplemental Fig. 4; Supplemental Tables 5–6), and that there was no significant difference in average response size for the high gamma, beta, or alpha frequency bands (Supplemental Fig. 5; Supplemental Table 7). Thus, neither stimulus specifics nor depression status confounded our results.

Aperiodic 1/f slope tracks fluctuations in tonic pupil diameter

We next aimed to determine if tonic pupil-indexed arousal state was reflected in activity in the auditory cortex and contributed to the variability in our phasic responses. To do so, we first aimed to establish if the ongoing cortical dynamics in the auditory regions tracked tonic pupil-indexed arousal state (Fig. 3a). More specifically, we investigated if aperiodic 1/f slope (Fig. 3b), which has recently been suggested to index arousal state41, tracked tonic pupil diameter fluctuations given by absolute pupil size (% Max), as opposed to the change from baseline, in the same time window (0–750 ms), and if the direction of this relationship aligned with our predictions (schematized in Fig. 3c). The average aperiodic 1/f slope estimate (Fig. 3b) in the auditory cortical regions of our subjects was M(SD) = -2.93(0.31) across all recorded electrodes (N electrodes = 140). In line with our hypothesis, we found that as pupil diameter became larger, cortical activity reflected a lower proportion of low-frequency activity and a higher proportion of high-frequency activity. In terms of aperiodic 1/f slope, this corresponded to a positive linear relationship, where slope becomes flatter (less negative) as pupil diameter becomes larger (Table 2). LME analyses showed that delta power extracted from the auditory cortex was not a significant predictor of tonic pupil size (p-adj.=0.674), but aperiodic 1/f slope was a significant predictor (p-adj = 0.019), at the single trial level (Table 2).

Fig. 3

Aperiodic 1/f slope as a measure of tonic state. (a) Diagram of synchronized pupil and cortical activity. (b) Average spectra and slope estimate per electrode, colored by subject. The aperiodic 1/f estimate is calculated by identifying the slope of a linear fit to a constrained range (20–45 Hz) of the log-log transformed power spectrum. (c) Predicted tonic pupil-to-slope arousal state relationship. At low levels of arousal state, pupil diameter is small and cortical membrane potentials exhibit a large proportion of low-frequency activity and a low proportion of high-frequency activity, which would be reflected as a steeper (more negative) aperiodic slope estimate. At high levels of arousal, pupil diameter is large and cortical membrane potentials exhibit a large proportion of high-frequency activity and a low proportion of low-frequency activity, which would present as a flatter (more positive) slope estimate.
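The slope estimate described in panel (b) can be sketched as an ordinary least-squares line fitted to the log-log power spectrum within 20–45 Hz. The code below is a minimal illustration under that reading; the function names and the synthetic spectrum are assumptions for this example, and the authors' actual implementation may differ (for instance in how spectra are computed or smoothed).

```python
import math

def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def aperiodic_slope(freqs, power, f_lo=20.0, f_hi=45.0):
    """Slope of log10(power) versus log10(frequency) within [f_lo, f_hi] Hz."""
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power) if f_lo <= f <= f_hi]
    xs, ys = zip(*pts)
    return ols_slope(xs, ys)

# Synthetic spectrum following an exact power law with exponent 2.93
# (chosen to match the reported mean slope; this is not recorded data).
freqs = [float(f) for f in range(1, 101)]
power = [1e4 / f ** 2.93 for f in freqs]
slope = aperiodic_slope(freqs, power)
```

Because the fit is done on log-transformed axes, the base of the logarithm does not affect the slope; a flatter spectrum yields a value closer to zero.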

Table 2 Aperiodic 1/f slope correlates with pupil diameter.

Auditory cortical high gamma response exhibits an inverted-U shaped dependence on 1/f slope

We next examined whether the phasic responses in the auditory cortex followed the well-characterized9 inverted-U shaped state-dependent relationship, using aperiodic 1/f slope as our primary measure of ongoing tonic cortical state. Because we found that only high gamma responses tracked phasic activity at the trial level (Table 1), we focused on the high gamma response component for all subsequent analyses. We compared two measures of tracking arousal state: delta power fraction9,36, as well as aperiodic 1/f slope. We binned the high gamma cortical response sizes by each measure of state (i.e. 1/f slope or delta power fraction) extracted immediately before the onset of the tone stimulus. High gamma exhibited an inverted-U shaped relationship with pre-stimulus arousal state for both delta fraction (standard: adj-R2 = 0.698, p = 0.012; oddball: adj-R2 = 0.728, p = 0.0085; Supplemental Fig. 6a) and 1/f slope (standard: adj-R2 = 0.974, p = 4.50e-05; oddball: adj-R2 = 0.511, p = 0.072; Fig. 4a); for 1/f slope, the inverted-U relationship in the oddball condition was only marginally significant.
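The binning-and-fitting step can be illustrated with a short sketch: trials are sorted by the state measure, grouped into equal-count bins, and a quadratic is fitted to the bin means, where a negative quadratic coefficient indicates an inverted-U. The function names, the number of bins, and the synthetic data are assumptions for this example; the source does not specify the binning scheme in this section.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns (b0, b1, b2)."""
    rows = [[1.0, x, x * x] for x in xs]
    # Augmented 3x4 matrix for the normal equations (X^T X) b = X^T y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))

def bin_means(state, response, n_bins):
    """Sort trials by state, split into n_bins equal-count bins, return bin means."""
    pairs = sorted(zip(state, response))
    size = len(pairs) / n_bins
    out = []
    for b in range(n_bins):
        chunk = pairs[round(b * size):round((b + 1) * size)]
        out.append((sum(s for s, _ in chunk) / len(chunk),
                    sum(r for _, r in chunk) / len(chunk)))
    return out

# Synthetic trials: response peaks at an intermediate slope of -3 (not study data).
state = [-4 + 0.02 * i for i in range(100)]
response = [100 - 50 * (s + 3) ** 2 for s in state]
binned = bin_means(state, response, n_bins=10)  # n_bins=10 is an assumed choice
b0, b1, b2 = fit_quadratic([s for s, _ in binned], [r for _, r in binned])
peak_state = -b1 / (2 * b2)  # state value at which the fitted response is maximal
```

In this toy case the fitted curve opens downward and peaks at the intermediate state value, which is the pattern the adaptive gain theory predicts.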

Fig. 4

Inverted-U-shaped dependence of phasic responses on 1/f slope. (a) High gamma phasic auditory responses show an inverted-U-shaped state-dependent relationship with aperiodic 1/f slope. (b) Salience detection responses (oddball – standard differences) show an inverted-U-shaped relationship with aperiodic 1/f slope. (c) Pupil responses show a condition-specific inverted-U-shaped relationship with aperiodic 1/f slope for oddball trials, and a linear relationship for standard trials. ttp<0.10, *p < 0.05, **p < 0.01, ***p < 0.001.

To further resolve these group-level relationships, we examined responses at the single-trial level for standard and oddball trials, utilizing both measures of tonic state. LME models showed a significant inverted quadratic (i.e. inverted-U) fit (slope: p-adj. < 0.0001; slope-squared: p-adj. < 0.0001) for aperiodic 1/f slope predicting high gamma response amplitude. An additional main effect of stimulus condition (p-adj. < 0.0001), and interaction effects between condition and the quadratic fit for slope (slope*condition: p-adj. < 0.0001; slope-squared*condition: p-adj. = 0.0061), suggest that the oddball condition shows a larger response and a steeper inverted-U shaped relationship with slope compared to the standard condition (Table 3). High gamma phasic responses did not show a significant relationship with delta power fraction for standard or oddball trials (Table 3).

Table 3 Dependence of High Gamma Phasic Response on 1/f slope.

States with intermediate 1/f slope have maximal saliency responses in auditory cortex

The inverted-U shaped effect seen for high gamma phasic responses and tonic 1/f slope at the group level, combined with the interaction effect with stimulus condition at the trial level, suggested that the largest differences between standard and oddball responses occurred at intermediate slope values. To investigate this effect further, we computed the within-electrode high gamma salience detection component (oddball – standard response; see Methods) and binned these responses by delta power fraction (Supplemental Fig. 6b) and 1/f slope (Fig. 4b). We found a similar inverted-U shaped relationship for this saliency component with 1/f slope (adj-R2 = 0.98; p = 0.012), but not with delta power fraction.

Aperiodic 1/f slope exhibits an inverted-U shaped relationship with phasic pupil responses

Since phasic pupillary responses were tracked by phasic high gamma responses at the trial level, we also aimed to examine if event-locked pupillary responses, given by the average pupil dilation after trial onset (0–2,500 ms), reflected the known inverted-U shaped relationship with tonic arousal state when this tonic state measure was extracted from the auditory cortex. We performed analyses parallel to those examining cortical responses (i.e. group-level binning followed by LME fitting). We saw no clear relationship between pupil response and pre-stimulus auditory cortical arousal state for either state metric (Supplemental Fig. 7a–b). While the group-level standard condition did show a significant fit (adj-R2 = 0.974, p < 0.0001), this was not seen at the single-trial level (see Supplemental Table 8). Given that pupil responses are known to lag behind cortical responses9,34, we hypothesized that the response window for cortical activity (0–750 ms) in the auditory cortex may be more reflective of the tonic state that is driving pupillary response differences. To test this, we binned pupil responses by post-stimulus (0–750 ms) state values and performed different fits for each state metric. While the aperiodic slope showed an inverted-U shaped relationship with pupil responses for oddball (adj-R2 = 0.89, p-adj.=0.032; Fig. 4c), it showed a positive linear relationship in the standard condition (adj-R2 = 0.71, p-adj.=0.032; Fig. 4c), both of which generalized to the single-trial level within subjects (see Table 4). Delta fraction showed a non-significant inverted-quadratic relationship with pupil responses for the standard condition (p-adj.=0.0760; Supplemental Fig. 6c) and a non-significant U-shaped relationship in the oddball condition (p-adj.=0.0573; Supplemental Fig. 6c).

Table 4 Relationship of Pupil Response magnitude to 1/f Slope.

Aperiodic 1/f slope may reflect changes in tonic state in response to phasic salience responses

Finally, we sought to determine if salient stimuli resulted in phasic state changes, as indicated by our aperiodic 1/f slope measure. To do so, we compared 1/f slope between pre-stimulus, post-early, and post-late trial periods (Fig. 5a). We also sought to determine if this shift in slope was associated with trial-to-trial variability in phasic response size. Pre-stimulus aperiodic 1/f slope was not significantly different between standard and oddball trials (t(84) = -0.358, p = 0.943; Fig. 5b), and both conditions exhibited significant flattening of the slope in the early time window (STD: t(84) = 5.95, p-adj. < 0.0001; ODD: t(84) = 8.76, p-adj. < 0.0001), which was also not different between the two conditions (t(84) = 2.45, p-adj. = 0.098; Fig. 5b). In the standard condition, the 1/f slope estimate returned to the pre-stimulus slope estimate by the late time window (250–750 ms; t(84) = 1.49, p-adj. = 0.424), but the late slope estimate in the oddball condition remained flatter than in the pre-stimulus period (t(84) = 7.03, p-adj. < 0.0001) and was not different from the early post-stimulus window slope (t(84) = 1.73, p-adj. = 0.351; Fig. 5b).

Fig. 5

Prolonged alteration of 1/f slope with stimulus saliency. (a) Average smoothed pre-, post-early, and post-late power spectra for standard and oddball trials within the constrained frequency range (20–45 Hz) show different high: low frequency balances between trials. (b) Both standard and oddball responses show significant flattening of the aperiodic estimate early after stimulus playback, but this change is only maintained into the late post-stimulus time window for salient oddball trials.

To examine the effects of this prolonged slope shift on the following trial, we extracted the standard trial responses that occurred immediately after oddball trials and compared those responses to standard responses after other standards. As expected, oddball responses (M(SD) = 67.83(47.18)) were significantly larger than both standard-after-standard (M(SD) = 35.64(34.02); t(370) = 10.94, p < 0.0001) and standard-after-oddball responses (M(SD) = 41.29(36.46); t(543) = -7.53, p < 0.0001). However, the standard-after-oddball (‘After Oddball’, AO) high gamma response size across all subjects showed a larger % signal change from pre-stimulus baseline than standard responses that came after other standard responses (t(402) = 2.35, p-adj = 0.019; Supplemental Fig. 8). This result suggests that momentary state changes triggered by salient stimuli lead to increased responses to subsequent stimuli.
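The trial split used in this comparison can be sketched as a simple pass over the presentation order, labelling each standard by the condition of the trial before it. The label strings and function name below are hypothetical conveniences for this example, not names from the authors' code.

```python
def label_standards(conditions):
    """Label each trial by its own condition and, for standards, the preceding one.

    conditions: sequence of "std" / "odd" strings in presentation order.
    Returns labels: "odd", "std_after_odd", "std_after_std", or "std_first"
    (a standard with no preceding trial).
    """
    labels = []
    for i, cond in enumerate(conditions):
        if cond == "odd":
            labels.append("odd")
        elif i == 0:
            labels.append("std_first")
        elif conditions[i - 1] == "odd":
            labels.append("std_after_odd")
        else:
            labels.append("std_after_std")
    return labels

# Short illustrative sequence (not a real block from the task).
sequence = ["std", "std", "odd", "std", "std", "odd", "std"]
labels = label_standards(sequence)
```

Responses can then be grouped by label so that standard-after-oddball trials are compared with standard-after-standard trials.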

Discussion

The goal of this study was to evaluate aperiodic 1/f slope as a measure of the impacts of neuromodulatory brain state on saliency processing in auditory cortex. This was accomplished in three main steps: (1) determine if 1/f slope is correlated with tonic pupil diameter, which has been previously shown to track neuromodulatory brain state and RAS activity32,33,34,35; (2) determine how phasic responses in the auditory cortex are affected by pre-stimulus 1/f slope; and (3) determine if the 1/f slope changes after salient versus non-salient stimuli, and if these changes correlate with altered cortical sound responses on subsequent trials. We found that aperiodic 1/f slope shows the predicted positive linear relationship with tonic pupil diameter, exhibits an inverted-U shaped relationship with phasic sound responses, and is persistently altered after salient, but not non-salient, stimuli. Thus, aperiodic 1/f slope shows promise as a candidate for tracking impacts of arousal state for several reasons.

Our work builds on previous experimental results that have established that pupil dilation responses can track neural responses to salient stimuli in fMRI48, scalp EEG43, and iEEG42 measurements, with some reports showing the relationship at the single trial level. Mechanistic research on arousal state has identified a tonic mode of low-frequency activity in RAS and a phasic mode of high-frequency bursting activity driven by responses to salient stimuli in, for example, the LC22,23. Both tonic and phasic modes are reflected in pupil diameter fluctuations27,32,33,34,38. Low levels of tonic RAS activity (and associated small pupil size) correspond to low-frequency cortical activity9,27,37,49. These patterns may be important to understanding both healthy and pathological variability in phasic response sizes to incoming stimuli from the environment50,51,52.

We focused on high gamma responses because their trial-wise variation was tracked by variation in pupil dilation responses. We focused on 1/f slope as a state measure because of growing interest in the literature39,41,53 and because it outperformed other measures that we considered. In particular, low frequency activity did not exhibit consistent relationships to pupil or neural saliency responses. This may be due, in part, to the lower temporal resolution for tracking changes in low-frequency activity and definitional variability between studies in high-to-low frequency ratio measures24,36. Accordingly, while we recapitulated inverted-U shaped state relationships with the low-frequency activity measure at the group level, only aperiodic 1/f slope showed significant relationships with both tonic and phasic pupil and cortical activity at the single trial level.

A related limitation of our findings is the short time windows used to quantify delta activity. The delta power estimate was calculated over a window of 250 ms, and thus, may average information that also resides outside the designated time window. This study’s task was structured such that the inter-stimulus interval is 1.25–1.50 s following the offset of the previous stimulus, in order to efficiently collect oddball responses during the precious time of intracranial recording in humans. As a result, averaging over activity outside the designated windows may contaminate the delta power estimate with either the previous or following stimulus activity. Accordingly, the worse performance that we saw for delta power as a measure of tonic state may be due to this contamination.

While high gamma responses showed a shallow inverted-U shaped relationship in the standard condition, pupil diameter showed a linear relationship. This divergence in the tonic state-phasic response pattern may be another limitation imposed by this study’s inter-trial interval, because the pupil diameter response did not have sufficient time to fully return to the tonic baseline size; this effect greatly decreased the response amplitude, and its variability, for standard trials. Thus, the condition dependence we see for the high-gamma phasic response and pupil response relationship may simply be due to the superposition of responses to consecutive standard stimuli. Furthermore, since the response amplitudes and variability are so small in the standard condition, and tonic pupil diameter is correlated with tonic state (1/f slope), it is possible that the linear relationship we see between aperiodic slope and pupil response is simply reflecting the baseline relationship because of the small superimposed amplitudes. Future research interested in the phasic pupil relationship with cortical salience response should extend the inter-stimulus interval to allow each response to return to a true baseline pupil size for a more precise and accurate read of the LC-related phasic pupillary response amplitudes. A further caveat is that pupil fluctuations are also known to be associated with cognitive factors other than arousal state54, so future studies should consider this when identifying the types of tasks and stimuli they will use to elicit the phasic responses.

The final aim of this study was to identify if our proposed 1/f measure of state could capture momentary shifts in arousal induced by salient stimuli. Recent work shows that 1/f slope can change in response to auditory stimuli on a trial-to-trial basis55. Similarly, average salience responses in the current study showed an increase in high-frequency activity and decrease in low-frequency activity, which corresponded to a flattening of the aperiodic 1/f slope between the pre- and post-stimulus time windows. While the magnitude of this shift was not different between standard and oddball conditions in the early window, the shift in slope for the oddball condition was prolonged compared to the standard condition. We further show that the high gamma phasic responses to standard stimuli that come immediately after oddball stimuli are larger in magnitude than the standard stimuli responses after other standards, suggesting that this prolonged shift may be tracking a change in localized state. Future studies should aim to examine this shift in slope over longer periods and in response to ecologically relevant stimuli, which may produce a larger and more sustained effect than pure sine wave tones.

Taken together, our findings suggest that aperiodic 1/f slope captures the effects of neuromodulatory state at fine spatiotemporal resolution. Moreover, our results collectively align with the adaptive gain theory and may inform the understanding of salience processing disruptions in arousal-related disorders, such as depression. While we did not find significant differences in average auditory cortex responses between the small numbers of depressed and non-depressed patients in the current study, future studies could examine the dependence of phasic responses on tonic state in larger cohorts and in downstream regions with known response disruptions1,3,4,5 to identify the impacts of brain state on altered processing in those disorders.

Methods

Participants

The study was approved by the Institutional Review Board at Baylor College of Medicine (IRB: H-18112), in compliance with the international standard of the Declaration of Helsinki. All participants provided written informed consent. Participants consisted of 10 patients (Nfemale=5; mean age = 41 years; Supplemental Table 1) undergoing intracranial stereo-EEG (sEEG) monitoring for medically refractory epilepsy evaluation. sEEG probes were placed based on clinical criteria.

Auditory oddball paradigm

Participants completed an active ‘count’ auditory oddball (AOB) task in which they mentally counted the number of ‘oddball’ tones they heard. The count was reported verbally after each experimental block. Stimuli were 250-ms long pure tones at two frequencies (1 and 2 kHz) representing the standard and oddball sounds (Fig. 1a). The assignment of tone frequency to condition (standard or oddball) was counterbalanced across blocks within each participant. Each block consisted of 160 total trials. A trial included the auditory presentation of the tone followed by inter-trial intervals (ITIs) randomly varying between 1.0 and 1.2 s. The first ten trials of each block were always standard stimuli. The remaining 150 trials consisted of 80–85% standard tones with 15–20% pseudo-randomly interleaved oddball tones, which resulted in approximately 128–136 standard trials and 24–32 oddball trials per test block. The pseudo-randomization for oddball stimuli was necessary to ensure that there would be enough time to allow the pupil to return to baseline before presenting another oddball trial (6.0–9.0 s inter-oddball interval). Each block lasted approximately 3–4 min and each participant completed 2–7 blocks per session. Sounds were played binaurally from speakers situated directly behind the patient at an intensity adjusted based on individual comfort (~ 68–82 dB).
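As an illustration of the block structure described above, the following Python sketch generates one pseudo-randomized trial sequence. It is not the stimulus code used in the study: the function name `make_block`, the seed, and the rule of placing 4 or 5 standards between consecutive oddballs are assumptions, chosen so that with 250-ms tones and 1.0–1.2 s ITIs successive oddball onsets fall roughly 6–9 s apart.

```python
import random


def make_block(n_trials=160, n_lead=10, min_gap=4, max_gap=5, seed=0):
    """Return a list of 'standard'/'oddball' labels for one block.

    n_lead  : number of leading trials that are always standards
    min_gap : assumed minimum number of standards between two oddballs
    max_gap : assumed maximum number of standards between two oddballs
    """
    rng = random.Random(seed)
    seq = ["standard"] * n_trials
    # First oddball appears shortly after the leading standards.
    i = n_lead + rng.randint(0, max_gap)
    while i < n_trials:
        seq[i] = "oddball"
        # Skip a random number of standards before the next oddball.
        i += rng.randint(min_gap, max_gap) + 1
    return seq


if __name__ == "__main__":
    block = make_block(seed=1)
    print(block.count("standard"), "standards,", block.count("oddball"), "oddballs")
```

Under these assumed gap settings each block contains 25 to 30 oddballs, which is in the range reported for the task.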

Electrode localization

FSL was employed to align post-implantation CT brain scans from each patient, showing the location of the intracranial electrodes, to their pre-operative structural T1 MRI scans. We used FreeSurfer to reconstruct the brain surface from the MRI volume. BioImage Suite 35 and iELVis56 were used to obtain electrode coordinates and determine the position of the electrode with respect to anatomical landmarks. The anatomical assignment was based on the proximity to the cortical surface and labeled based on the Destrieux cortical parcellation atlas, which was then confirmed based on independent expert visual inspection. Electrodes were included if they were located within 5 mm of the grey matter boundary of the posterior Supratemporal Gyrus (pSTG). The auditory cortex was identified based on major sulci along the supratemporal gyrus (pSTG)57,58,59. More specifically, ROIs were within Heschl’s Gyrus bounded by the anterior temporal sulcus, through the second transverse gyrus bounded posteriorly by Heschl’s sulcus (Fig. 1b). A total of 80 electrodes were identified in the ROI. After the exclusion of electrodes contaminated by noise or artifacts, 73 electrodes remained for analyses.

Pupil recording and pre-processing

Pupil diameter was recorded using an EyeLink 1000 system (SR-Research, Osgoode, ON, Canada) while participants fixated on a cross presented at the center of a display monitor (Viewsonic VP150, 1920 × 1080) positioned at a comfortable distance (22–24 in) during the AOB task. Each block of data was sampled at 1,000 Hz (n = 1, Desktop Mode) or 500 Hz (n = 9, Remote Mode) after the successful completion of a 5-point calibration, validation, and single-point drift correction paradigm.

Blinks and artifacts were isolated using the findpeaks.m60 function in MATLAB (R2021a) and confirmed by visual inspection. Peaks were identified as maximal points in the pupil diameter Z-score time series with a minimum peak height of 10 S.D. and a minimum distance of 50 samples between peaks. Peak points were padded by 200 ms and linearly interpolated. Due to the minimum peak distance parameter, not all blinks or artifacts were detected in a single iteration of the pipeline. The findpeaks.m function was therefore reapplied until it could no longer detect any peaks, indicating that all artifacts had been eliminated. This was confirmed via visual inspection on each iteration, with a maximum of 7 iterations of the artifact elimination pipeline for a single patient. Two patients were eliminated due to excessive blinks and artifacts.
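The iterative artifact-removal loop can be sketched in Python as follows. This is a minimal illustration, not the MATLAB pipeline used in the study: it substitutes `scipy.signal.find_peaks` for findpeaks.m, and the function name `remove_artifacts`, the `max_iter` safety cap, and the choice to pad 200 ms on each side of a peak are assumptions.

```python
import numpy as np
from scipy.signal import find_peaks


def remove_artifacts(pupil, fs=500, z_thresh=10.0, min_dist=50,
                     pad_ms=200, max_iter=20):
    """Repeatedly detect extreme peaks in the z-scored trace and
    replace them (plus padding) by linear interpolation.

    Returns the cleaned trace and the number of passes that found peaks.
    """
    x = np.asarray(pupil, dtype=float).copy()
    pad = int(round(pad_ms * fs / 1000))  # padding in samples (assumed per side)
    t = np.arange(x.size)
    passes = 0
    while passes < max_iter:
        z = (x - x.mean()) / x.std()
        peaks, _ = find_peaks(z, height=z_thresh, distance=min_dist)
        if peaks.size == 0:
            break  # nothing left to remove
        bad = np.zeros(x.size, dtype=bool)
        for p in peaks:
            bad[max(0, p - pad):p + pad + 1] = True
        # Linear interpolation across the flagged samples.
        x[bad] = np.interp(t[bad], t[~bad], x[~bad])
        passes += 1
    return x, passes
```

Because the z-score is recomputed after every pass, smaller artifacts that were masked by a large one (or suppressed by the minimum peak distance) can be picked up on later passes.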

The pupil time series was down-sampled to 500 Hz and low-pass filtered at 50 Hz. Trial onsets were identified using event markers sent from the computer generating the standard and oddball sequence to the EyeLink 1000 system. The pupil time series was epoched between 1,000 ms before and 3,000 ms after sound onset. Maximum pupil size values were identified for each run per patient, and time series were normalized to the percentage of the maximum pupil size (% of max) to standardize the time series across patients and runs. Pupil responses used to quantify phasic pupil activity were calculated as the difference between the average % of max in the response window (first 2,500 ms) and that in the baseline window. Non-event-locked pupil size used to quantify tonic pupillary activity was calculated as the % of max pupil size within the same time window, but was not baseline corrected to reflect the change of the response. Single trials were eliminated if they contained greater than 40% interpolated values, or if their average response values were greater than or equal to five median absolute deviations from other epochs at that sample location within the task block (see Supplemental Table 1 for final trial counts across all runs and electrodes within each condition).
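The normalization and the two pupil measures can be illustrated with a short Python sketch. The function name `pupil_metrics` is hypothetical, and two details are assumptions rather than statements from the text: the baseline is taken as the full 1,000-ms pre-stimulus segment of each epoch, and the tonic measure is taken over the response window.

```python
import numpy as np


def pupil_metrics(epochs, run_max, fs=500, pre_ms=1000, resp_ms=2500):
    """Compute phasic and tonic pupil measures per trial.

    epochs  : array (trials x samples), each epoch starting pre_ms before onset
    run_max : maximum pupil size of the run, used for % of max normalization
    """
    pct = 100.0 * np.asarray(epochs, dtype=float) / run_max  # % of max
    onset = int(pre_ms * fs / 1000)
    resp_end = onset + int(resp_ms * fs / 1000)
    baseline = pct[:, :onset].mean(axis=1)          # assumed baseline window
    response = pct[:, onset:resp_end].mean(axis=1)  # first 2,500 ms after onset
    phasic = response - baseline                    # baseline-corrected response
    tonic = response                                # same window, not corrected
    return phasic, tonic
```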

Intracranial EEG recording and preprocessing

Neural signals were recorded from stereo EEG probes (Ad-Tech Medical Instrument Corporation or PMT Corporation) connected to a Cerebus data acquisition system (Blackrock Neurotech). All recording signals were amplified, filtered (high-pass 0.3 Hz first-order Butterworth, low-pass 500 Hz fourth-order Butterworth), and digitized at 2,000 Hz. Acoustic stimulation was synchronized with neural signals using an audio analog input which copied the sound waveform to the Cerebus system.

Visual inspection was used to exclude recordings from electrodes with excessive noise, artifacts, and/or interictal activity. Recordings from included electrodes were then re-referenced using a common average reference of all valid electrodes, and notch filtered to remove line noise (60 Hz, first harmonic, second harmonic, and third harmonic). Finally, all recordings were downsampled to 500 Hz using a low-pass Chebyshev type 1 IIR filter of order 8.
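A minimal Python sketch of this re-referencing, notch filtering, and downsampling chain is given below. It is an illustration only: the function name `preprocess`, the notch quality factor, the zero-phase filtering, and the exact list of notch frequencies (60, 120, 180, and 240 Hz) are assumptions. By default, `scipy.signal.decimate` applies an order-8 Chebyshev type I IIR low-pass filter before downsampling, which matches the filter type named in the text.

```python
import numpy as np
from scipy.signal import decimate, filtfilt, iirnotch


def preprocess(data, fs=2000, notch_hz=(60.0, 120.0, 180.0, 240.0),
               q=30.0, factor=4):
    """Common average reference, line-noise notch filters, and decimation.

    data : array (electrodes x samples) containing valid electrodes only
    Returns the processed array and the new sampling rate.
    """
    # Common average reference: subtract the mean across electrodes.
    x = data - data.mean(axis=0, keepdims=True)
    # Zero-phase notch filter at the line frequency and its harmonics.
    for f0 in notch_hz:
        b, a = iirnotch(f0, q, fs=fs)
        x = filtfilt(b, a, x, axis=-1)
    # Anti-aliased downsampling (2,000 Hz -> 500 Hz for factor=4).
    x = decimate(x, factor, ftype="iir", axis=-1)
    return x, fs / factor
```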

The signals obtained were transformed from the time domain into the frequency domain using a wavelet transformation (Morlet, 7 cycles per wavelet; frequencies equally spaced on a linear scale from 2 to 200 Hz). This provided an electrode-by-time matrix of power values for each data run. The analog recorded acoustic signal was used to identify sound onsets. The neural signals of interest (from the electrode-by-time matrix) were obtained by extracting values for each trial (between 1,000 ms before and 2,000 ms after sound onset). Trial power values were normalized to a baseline period (-250 ms to 0 ms; preceding stimulus onset) by calculating % signal change from baseline for the early intra-stimulus (0 ms to 250 ms) and late (250 ms to 750 ms)61 post-stimulus time windows. We computed percent power change across canonical neural frequency bands (High Gamma [70–150 Hz]; Low Gamma [32–70 Hz]; High Beta [20–30 Hz]; Low Beta [12–20 Hz]; Alpha [8–10 Hz]; Theta [4–8 Hz]; Delta [2–4 Hz]) when investigating the shift in aperiodic 1/f slope between pre- and post-stimulus periods. We focused on our a priori hypotheses surrounding high gamma, low beta, and alpha bands in the correlational and linear mixed-effects modeling analyses. Cortical arousal state was quantified in the pre (-250–0 ms) and post (0–750 ms) stimulus time windows using the delta power fraction (delta power / total power), or the aperiodic (1/f) slope estimate, which was calculated using a linear fit (MATLAB function polyfit.m, degree 1, MathWorks™ 2023b) over the 20–45 Hz spectral range. The intermediate frequencies (20–45 Hz) reflect the region of transition from high to low frequencies, such that increased delta power and decreased gamma power will produce a steep slope in the transition region, whereas low delta power and increased gamma power should produce a flatter slope62. The range restriction is also recommended by Gerster et al. (2022), as restriction of the range for the 1/f slope fitting shows a higher signal-to-noise ratio63.
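The two state measures can be sketched in Python as follows. This is an illustration, not the MATLAB implementation: `np.polyfit` stands in for polyfit.m, the function names are hypothetical, and fitting in log10-log10 coordinates and treating the delta band as the half-open interval [2, 4) Hz are assumptions.

```python
import numpy as np


def aperiodic_slope(freqs, power, fmin=20.0, fmax=45.0):
    """Slope of a degree-1 fit to the power spectrum over [fmin, fmax].

    Assumes the fit is done on log10(power) versus log10(frequency).
    """
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    m = (freqs >= fmin) & (freqs <= fmax)
    slope, _intercept = np.polyfit(np.log10(freqs[m]), np.log10(power[m]), 1)
    return slope


def delta_fraction(freqs, power, band=(2.0, 4.0)):
    """Delta power divided by total power across all frequencies."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    m = (freqs >= band[0]) & (freqs < band[1])
    return power[m].sum() / power.sum()
```

With these conventions, a spectrum dominated by low frequencies yields a more negative (steeper) slope and a larger delta fraction, and a flatter spectrum yields the opposite.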

Statistical analyses

Pupil response profiles were assessed at the group level between subjects, and at the trial level within subjects. Pupil diameter responses and cortical responses for both conditions were calculated for each task block in each canonical frequency band in terms of % power change for cortical responses, or % of max for pupil responses. A Pearson correlation was conducted for each LFP response component-condition pair, and Benjamini-Hochberg correction was performed for a total of six correlational analyses. Linear mixed-effects (LME) modeling was used to examine the trial-by-trial relationship within subjects for the significant group-level relationships (high gamma, low beta, alpha). Average high gamma, low beta, and alpha component responses were extracted for each electrode per trial and then averaged over all electrodes per trial, such that there was one value per trial, per test block. The LME was constructed with pupil response as the dependent variable, cortical response (high gamma, low beta, or alpha) and condition (standard vs. oddball) as interacting fixed effects, and subject ID as a random effect. Benjamini-Hochberg correction was performed for a total of three high gamma LMEs (see Table 1 for model notations). All LME modeling was completed using the lmer function from the lme4 package in R; notations provided in the results tables reflect lmer model notation.

The dependence of phasic responses on tonic state was assessed at the group level between subjects, and at the trial level within subjects. Trials were averaged over all electrodes such that each task block only had one value per trial. Pre-stimulus slope or delta fraction (delta fraction = delta power/total power) was computed in the baseline (-250–0 ms) window for each trial. These values were then sorted into evenly spaced bins per condition, except for the extremes, which were extended to ensure adequate bin size (Nbin > 5, bin cutoffs for 1/f slope = [-8.5, -5.5, -5.0, -4.5, -4.0, -3.5, -3.0, -2.5, -2, -2.5] and delta fraction = [0, 0.05, 0.1, 0.15, 0.20, 0.25, 0.3, 0.35, 0.40, 0.45, 0.5]). Average pre-stimulus slope values per task block were examined using the MATLAB isoutlier.m function. One subject (Subject 10) was identified as an outlier for both task blocks (falling greater than three median absolute deviations from the group mean) and was excluded from group analyses but included in within-subjects trial-level analyses. A regression line was fit to the binned dataset using the MATLAB lmfit.m function. A quadratic (high gamma responses, pupil-oddball responses) or linear (pupil-standard responses) model was used, depending on which fit was most appropriate for each relationship based on R-squared values. Salience detection component relationships were assessed by calculating the average difference between oddball and standard responses per electrode, which was then binned by the averaged pre-stimulus slope value and fit with a quadratic model. Cortical trial-level analyses were conducted using LME modeling per electrode within each subject. Pupil trial-level analyses also utilized LME modeling, though (pre- or post-stimulus) slope values were averaged over all pSTG electrodes per task block for each trial and only fit per subject to avoid pupil response redundancy.
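The binning and curve-fitting step can be illustrated with the Python sketch below. It is a simplified stand-in for the MATLAB analysis: the function name `bin_and_fit` is hypothetical, `np.polyfit` replaces the MATLAB regression routine, and the minimum bin count of 6 is an assumed reading of Nbin > 5. A negative leading coefficient in the degree-2 fit corresponds to an inverted-U relationship.

```python
import numpy as np


def bin_and_fit(state, response, edges, degree=2, min_count=6):
    """Bin trials by tonic state, average the response per bin, and fit a
    polynomial of the given degree to the binned means.

    Returns the polynomial coefficients (highest power first) and R-squared.
    """
    state = np.asarray(state, dtype=float)
    response = np.asarray(response, dtype=float)
    idx = np.digitize(state, edges) - 1  # bin index of every trial
    centers, means = [], []
    for b in range(len(edges) - 1):
        in_bin = idx == b
        if in_bin.sum() >= min_count:    # skip sparsely populated bins
            centers.append(state[in_bin].mean())
            means.append(response[in_bin].mean())
    centers, means = np.array(centers), np.array(means)
    coefs = np.polyfit(centers, means, degree)
    pred = np.polyval(coefs, centers)
    ss_res = np.sum((means - pred) ** 2)
    ss_tot = np.sum((means - means.mean()) ** 2)
    return coefs, 1.0 - ss_res / ss_tot
```

Calling the function once with `degree=1` and once with `degree=2` and comparing the returned R-squared values mirrors the model choice described above.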

Cortical response profiles were assessed at the group level. Epoched spectra were averaged over all trials for each condition (standard vs. oddball) for a given task block, and early (0–250 ms) and late (250–750 ms) time windows were extracted for each canonical frequency band. Wilcoxon signed-rank tests were first used to assess if each condition-frequency band pair exhibited a significant change from zero, and Benjamini-Hochberg correction was used to adjust for 14 total tests per time window. Then average oddball and average standard responses for each time window were matched by task block and the difference was computed for each pair (Oddball amplitude – Standard amplitude) for each canonical frequency band. Wilcoxon signed-rank tests were again used to assess if each band amplitude was different from zero, and Benjamini-Hochberg correction was used for seven total tests for each time window. This method was also used for assessing the effect of tone frequency (1,000 vs. 2,000 Hz) rather than salience condition, and a Kruskal-Wallis test was used to compare group means. The comparison between depressed and non-depressed cohorts was averaged at the level of electrodes, such that all trials of one condition were averaged within each electrode, and then all electrode values were pooled across subjects. Student’s t-tests were utilized to compare changes in frequency bands, and the remaining analyses (corrections for multiple comparisons) were congruent with those applied to the pooled dataset.
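The per-band testing procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original analysis code: `scipy.stats.wilcoxon` provides the signed-rank test, while `benjamini_hochberg` and `band_difference_tests` are hypothetical helper names, and the input is assumed to be a dictionary mapping each band to its per-block oddball minus standard differences.

```python
import numpy as np
from scipy.stats import wilcoxon


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted


def band_difference_tests(diffs_by_band):
    """Signed-rank test of each band's differences against zero,
    followed by Benjamini-Hochberg correction across bands."""
    bands = list(diffs_by_band)
    p_raw = []
    for band in bands:
        _stat, p = wilcoxon(diffs_by_band[band])
        p_raw.append(p)
    p_adj = benjamini_hochberg(p_raw)
    return dict(zip(bands, p_raw)), dict(zip(bands, p_adj))
```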

Pre-, post-early, and post-late stimulus slope estimates were averaged over all trials per condition within a given task block. An LME was fit with aperiodic slope estimate as the dependent variable, time window (pre, post-early, post-late) and condition as interacting fixed effects, and subject ID as a random effect. Estimated Marginal Means (EMMs) were computed using the emmeans64 function in R, and pairwise comparisons were made between each condition-time window pair. Corrections were made using the Benjamini-Hochberg method. After-oddball responses were identified, and an LME was fit with high gamma responses as the dependent variable, condition (standard, oddball, after oddball) as the fixed effect, and subject ID as a random effect. EMMs were computed, and pairwise comparisons were made across conditions.
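Identifying after-oddball trials amounts to relabeling each standard that immediately follows an oddball. A minimal Python sketch is shown below; the function name `label_after_oddball` and the string labels are hypothetical, and the trial sequence is assumed to be in presentation order within a single block.

```python
def label_after_oddball(conditions):
    """Relabel standards that directly follow an oddball.

    conditions : list of 'standard' / 'oddball' labels in presentation order
    Returns a new list in which such standards are marked 'after_oddball'.
    """
    labels = list(conditions)
    for i in range(1, len(conditions)):
        if conditions[i] == "standard" and conditions[i - 1] == "oddball":
            labels[i] = "after_oddball"
    return labels
```

The resulting three-level label can then serve as the condition factor in the mixed-effects model.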