Introduction

Sound, intrinsically bound to the temporal domain, necessitates temporal integration for coherent perception. Traditional auditory research has predominantly concentrated on the frequency domain, in which the auditory system is tonotopically organized to segregate and process distinct frequencies along the auditory pathway1. Nonetheless, the critical importance of the temporal dimension in auditory processing cannot be overstated. Rhythm, timing, and the recognition of complex sound patterns are fundamentally reliant on temporal cues2. This temporal aspect underpins essential facets of speech and music perception, alongside the discrimination of environmental sounds3. An auditory object (a perceptual unit formed through integration of spectral, temporal, and spatial cues that the brain identifies as distinct from background sounds) emerges from the integration process. Auditory objects may span milliseconds to seconds (e.g., a spoken syllable, a musical note, or a complete phrase), and they exhibit hierarchical organization across different timescales. For instance, temporal integration in humans spans various timescales4, as demonstrated in oral language processing where the brain aligns with syllables, phrases, and sentences at differing temporal scales5,6. A related but distinct process is auditory streaming, which refers to the brain’s ability to segregate different sound sources over time, forming separate perceptual streams. While auditory streaming explains how the brain organizes sequential sounds into distinct perceptual streams based on acoustic cues such as frequency or spatial location7,8,9, temporal integration accounts for the neural process by which temporally proximal (discrete) auditory events are fused into a coherent perceptual unit. Despite its crucial role in auditory perception, the neural mechanisms underlying temporal integration, particularly the neural signatures related to the fusion process, remain poorly understood.

Addressing this gap requires sounds with distinct components, so that the holistic response can be isolated from the responses elicited by individual elements. Click trains, composed of uniform pulses that vary only in temporal spacing, are a well-suited stimulus for studying temporal auditory processing. Despite their prevalent use in auditory research, click trains have rarely been studied for their holistic representation, as most research has focused on the encoding of individual clicks10,11,12. However, psychological studies have indicated that when the inter-click interval (ICI) of a click train is small (< 33 ms), it can be perceived as a continuous pitch-like sound13,14, suggesting temporal integration during the perception of the click train. This indicates that a click train, possessing a long-timescale structure but composed of individual clicks with short-timescale characteristics, can be perceived as a coherent auditory pitch rather than a collection of individual elements.

To investigate temporal integration in a controlled manner, we designed a novel “transitional click train” paradigm. A transitional click train consists of two consecutive segments of click sounds, where the ICI differs slightly between the first and second segments. The first segment establishes a stable temporal pattern (potentially yielding a perceived pitch), and the second segment introduces a new pitch, creating a temporal change midway through the sound. If the auditory system integrates the first segment into a single auditory object (e.g., a continuous pitch), then the transition to the new ICI would be perceived as a salient change—essentially marking the emergence of a new auditory object. This paradigm is particularly advantageous because it probes the brain’s ability to detect internal changes in an ongoing sound, rather than merely responding to the abrupt onset of a new sound. Moreover, when ICIs are in the millisecond range, prolonged stimulation leads to cortical adaptation following the initial onset response, such that individual clicks no longer evoke discernible responses10,15. The transitional click train leverages this adaptation, maintaining consistent stimulation while introducing a perceptual shift—thereby isolating the neural signature associated with the perceptual transition between auditory objects.

Crucially, temporal integration is not only foundational for basic auditory cognition, but is also increasingly recognized as a marker of conscious sensory processing and higher cognitive functions. Recent studies indicate that disruptions in temporal integration are linked to disorders of consciousness16,17 and various psychiatric conditions, including schizophrenia18, autism19, ADHD, and Parkinson’s disease20. Neurophysiological work has further demonstrated that processing of temporal structure is impaired in patients with disordered consciousness21,22. Accurate assessment of residual consciousness is critical for prognosis and clinical decision-making; however, behavioral assessments are often unreliable and prone to misdiagnosis. Neurophysiological methods offer objective tools to detect “covert cognition” and improve diagnostic accuracy. Despite this potential, few paradigms currently provide temporally precise, non-invasive markers of temporal integration that are well-suited for clinical applications.

In our study, we employed the “transitional click train” to explore the neural correlates of temporal integration within click trains. Our results demonstrated that the human brain indeed exhibited a specific change response closely linked to temporal integration when exposed to transitional click trains. This discovery provides an important neural indicator of how the auditory system constructs and modifies auditory objects over time. Furthermore, our findings open new directions for auditory research, including potential clinical applications, as deficits in temporal integration are known to be associated with various neurological and psychiatric disorders.

Results

Temporal merging into auditory objects with click trains

The human brain seamlessly integrates discrete sounds into a unified perceptual experience when the interval between these sounds is exceptionally short. For example, in the case of click trains, individuals cannot discern the gaps between clicks when the inter-click interval (ICI) falls below 29.6 ms23 (gap detection task in Experiment 1; Supplementary Fig. 1). This phenomenon is an indication of temporal integration, whereby separate clicks merge into a single auditory object experienced psychologically. This process, where sounds with small auditory gaps integrate into a singular auditory object, is what we termed “temporal merging”. To investigate the neural representation of this temporal integration, we devised an experimental protocol featuring two types of click trains. The first type is a regular click train, characterized by uniform ICIs (the top row of Fig. 1a). The second type is an irregular click train, marked by variable ICIs (the bottom row of Fig. 1a). These timing variations disrupt the formation of a unified temporal pitch, thus providing a comparison condition to assess whether rhythmic consistency is essential for temporal merging. We created transitional trains by linking two 1-s regular click trains with different ICIs, one at 4 ms (train 1) and the other at 4.06 ms (train 2), and referred to these as Reg4-4.06 (the top row of Fig. 1b). Similarly, we generated transitional trains using irregular click trains with two distinct average ICIs, again one at 4 ms (train 1) and the other at 4.06 ms (train 2), and labeled these as Irreg4-4.06 (the bottom row of Fig. 1b). Unless otherwise specified, the standard deviation of the irregular click train (e.g., Irreg4-4 and Irreg4-8) is set to half of the mean value (μ/2). The two types of transitional trains were randomly presented to participants. For illustration, representative audio files of Reg4-4.06 and Irreg4-4.06 are provided as supplementary materials.
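As an illustration of the stimulus construction described above, the following Python sketch computes click onset times for Reg4-4.06 and Irreg4-4.06. It is not the authors' stimulus code: the function names, the seeding, the rule of redrawing very short Gaussian ICIs (`min_ici_ms`), and the choice to start the second segment exactly at 1000 ms are all assumptions made for the example.

```python
import random


def regular_onsets(ici_ms, duration_ms, start_ms=0.0):
    """Onset times (ms) of a regular click train with a constant ICI."""
    n_clicks = int(round(duration_ms / ici_ms))
    return [start_ms + k * ici_ms for k in range(n_clicks)]


def irregular_onsets(mean_ici_ms, sd_ms, duration_ms, rng, start_ms=0.0,
                     min_ici_ms=0.2):
    """Onset times (ms) of a click train with Gaussian-distributed ICIs.

    Assumption: ICIs shorter than min_ici_ms are redrawn so that clicks
    never overlap; the source does not say how such draws are handled.
    """
    onsets = []
    t = start_ms
    while t < start_ms + duration_ms:
        onsets.append(t)
        ici = rng.gauss(mean_ici_ms, sd_ms)
        while ici < min_ici_ms:
            ici = rng.gauss(mean_ici_ms, sd_ms)
        t += ici
    return onsets


def transitional_onsets(ici1_ms, ici2_ms, segment_ms=1000.0,
                        sd_fraction=0.0, seed=0):
    """Two consecutive segments; sd_fraction=0 gives a regular train."""
    rng = random.Random(seed)
    if sd_fraction == 0.0:
        first = regular_onsets(ici1_ms, segment_ms)
        second = regular_onsets(ici2_ms, segment_ms, start_ms=segment_ms)
    else:
        first = irregular_onsets(ici1_ms, sd_fraction * ici1_ms,
                                 segment_ms, rng)
        second = irregular_onsets(ici2_ms, sd_fraction * ici2_ms,
                                  segment_ms, rng, start_ms=segment_ms)
    return first + second


reg = transitional_onsets(4.0, 4.06)                     # Reg4-4.06
irreg = transitional_onsets(4.0, 4.06, sd_fraction=0.5)  # Irreg4-4.06, SD = mu/2
```

Note that redrawing short intervals shifts the realized mean ICI slightly above the nominal value, so a real implementation would need its own rule for keeping the average ICIs matched.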

Fig. 1: Change responses in transitional click trains.

a Illustration of a regular click train (above) and an irregular click train (below), both comprising 0.2-ms pulses with specific ICI, where regular click trains maintain a constant ICI and the ICI of the irregular click train is randomized following a Gaussian distribution. b Setup of the transitional click train showing a regular transitional click train beginning with a 1-s regular click train at 4-ms ICI, transitioning into a 1-s regular click train at 4.06-ms ICI, denoted as Reg4-4.06, and an irregular group where regular click trains are replaced by irregular ones while keeping the average ICIs equal (standard deviation: μ/2, where μ is the mean ICI value), denoted as Irreg4-4.06, with the black pulse indicating the common pulse shared by both trains and the green dashed line marking the transition point. c Grand-averaged responses at POz to Reg4-4.06 (red) and Irreg4-4.06 (black) across 42 subjects from Session 1 of Experiment 2. Shaded areas around curves represent ± standard error of the mean (SEM). Orange vertical dashed lines indicate onset, change, and offset of the stimulus, from left to right. The yellow vertical bars represent significant differences between the two conditions (p < 0.05, permutation test). d A magnified view from (c) focusing on the window from −200 to 400 ms relative to the change point in the transitional click train. e Scatterplots of the relative response magnitude (RM) of change responses during 74–251 ms after change for each subject under Reg4-4.06 and Irreg4-4.06, showing significantly larger change responses in the regular group compared to the irregular group (p < 0.001, two-tailed paired t-test). Topographic maps (top-left: Reg4-4.06; bottom-right: Irreg4-4.06) show RM distributions across the scalp, with dots indicating electrodes with significant change responses (p < 0.05, two-tailed one-sample t-test, FDR corrected). f Source reconstruction of change response. Color scale shows the relative difference in source power between Reg4-4.06 and Reg4-4 within the [0, 300] ms window after change, averaged across 42 subjects. Only significant areas are colored (p < 0.05, cluster-based permutation test). g Similar to (f) but for Irreg4-4.06 and Irreg4-4, with no area showing a significant difference.

At the example electrode POz (Fig. 1c, d), a significant difference (p < 0.05, permutation test, 333−409 ms relative to the onset) emerged between the regular and irregular trains following the onset response, converging before the change point. The change responses within the time windows (92 to 148 ms and 163 to 280 ms relative to the change point) were significantly stronger for Reg4-4.06 than for Irreg4-4.06 (n = 42 from Session 1 of Experiment 2, p < 0.05, permutation test). The change response to Reg4-4.06 in the temporal-parietal-occipital scalp regions consisted of a positive component peaking around 120 ms after the change, followed by a negative component peaking around 200 ms (Fig. 1d). To quantify this change response across the entire scalp, we designated the positive component as “change P1” (cP1) and the negative component as “change N2” (cN2), and calculated the root mean square (RMS) over 74 to 251 ms following the change point for each channel, covering the combined cP1-cN2 responses (yellow vertical bar, Supplementary Fig. 2c). Furthermore, among the 42 subjects, the majority exhibited higher response amplitudes in the time window of the change response for Reg4-4.06 compared to Irreg4-4.06 (t(41) = 6.63, p < 0.001, d = 1.02, paired t-test; Fig. 1e), highlighting the differential impact of regular versus irregular transitional trains on auditory processing. Across all the channels of the 42 subjects, pronounced change responses to Reg4-4.06 were observed in the temporal, parietal, and occipital electrodes (Supplementary Fig. 2a, c). In contrast, the response to Irreg4-4.06 primarily manifested as an onset response, with a weak or indistinct change response observed (Supplementary Fig. 2b, d). Source analysis further indicated that the origin of the change response was predominantly localized within the auditory cortex (Fig. 1f, g).
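The RMS quantification over the 74 to 251 ms window can be sketched as follows. This is a minimal example rather than the analysis code used in the study: the sampling rate, the epoch layout (first sample at −200 ms relative to the change point), and the function name `window_rms` are assumptions, and any further normalization into a "relative" response magnitude is not shown because it is not specified in this passage.

```python
import math


def window_rms(signal, fs_hz, t0_ms, start_ms, end_ms):
    """RMS of the samples whose time lies in [start_ms, end_ms].

    signal : samples of one channel (e.g., in microvolts)
    fs_hz  : sampling rate in Hz (assumed value in the example below)
    t0_ms  : time of the first sample relative to the change point
    """
    first = max(0, math.ceil((start_ms - t0_ms) * fs_hz / 1000.0))
    last = min(len(signal) - 1,
               math.floor((end_ms - t0_ms) * fs_hz / 1000.0))
    segment = signal[first:last + 1]
    if not segment:
        raise ValueError("window lies outside the epoch")
    return math.sqrt(sum(x * x for x in segment) / len(segment))


# Synthetic epoch: 600 samples at 1000 Hz covering -200 to 399 ms.
fs = 1000.0
epoch = [math.sin(2 * math.pi * 5.0 * (n / fs)) for n in range(600)]
rms_change = window_rms(epoch, fs, t0_ms=-200.0, start_ms=74.0, end_ms=251.0)
```

Applying the same function to every channel gives one value per electrode, which is the kind of per-channel quantity a scalp topography can display.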

To ascertain whether the observed change response stemmed from true temporal integration or was merely a reaction to the transient change of ICIs, we systematically replaced subsets of the original 4 ms ICIs in the ongoing click train with 4.06 ms ICIs (1, 2, 4, 8, 16, or 32 replacements; Session 2 of Experiment 2; Fig. 2a). Crucially, these modified intervals were not silent gaps but slightly lengthened ICIs, preserving continuous acoustic stimulation while altering the temporal pattern. Responses to these modified click trains were compared with those elicited by the Reg4-4.06 transitional train and the continuous Reg4-4 click train (fixed 4 ms ICIs). Remarkably, the introduction of a single interval failed to produce any noticeable change response (t(23) = 0.52, p = 0.68, d = 0.09, BF10 = 0.23, one-sample t-test), as evidenced in Fig. 2b, c. It was not until the insertion of 16 intervals that the change response saturated to the level of Reg4-4.06 (Fig. 2b, c). Thus, the change response most likely corresponds to a perceptual shift between two pitches, each formed through temporal merging.
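The interval-replacement manipulation can be sketched in Python as below. The text does not state where in the train the lengthened intervals were placed or whether they were contiguous, so the contiguous placement, the starting index, the total number of intervals, and the function names are all hypothetical choices for this example.

```python
def ici_sequence_with_replacements(n_intervals, n_replaced, start_index,
                                   base_ms=4.0, new_ms=4.06):
    """ICI list in which n_replaced consecutive intervals are lengthened.

    Assumption: replaced intervals are contiguous and begin at start_index.
    """
    if n_replaced < 0 or not 0 <= start_index <= n_intervals - n_replaced:
        raise ValueError("replacements do not fit inside the train")
    icis = [base_ms] * n_intervals
    for k in range(start_index, start_index + n_replaced):
        icis[k] = new_ms
    return icis


def onsets_from_icis(icis, start_ms=0.0):
    """Cumulative click onset times (ms) from a list of ICIs."""
    onsets = [start_ms]
    for ici in icis:
        onsets.append(onsets[-1] + ici)
    return onsets


# One train per condition: 1, 2, 4, 8, 16, or 32 lengthened intervals.
trains = {n: onsets_from_icis(ici_sequence_with_replacements(500, n, 250))
          for n in (1, 2, 4, 8, 16, 32)}
```

Because each replacement adds only 0.06 ms, even 32 replacements lengthen the whole train by less than 2 ms, which is why stimulation remains effectively continuous.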

Fig. 2: Change responses depend on temporal integration.

a Experimental setup (Session 2 of Experiment 2). 1, 2, 4, 8, 16, and 32 intervals (not silent), each measuring 4.06 ms, were interspersed within a click train maintaining a 4 ms ICI. Responses to these modified click trains were then compared with those elicited by both the Reg4-4.06 transitional trains and a continuous click train with a 4 ms ICI. b Grand-averaged change responses (at POz) to different numbers of inserted clicks. Shaded areas represent ± SEM (n = 24). c Relative RM of change responses as a function of the number of inserted clicks, with the RM time window defined based on the Reg4-4.06 condition. Boxplots show group-level RM values (mean: central line; standard error: box edges; 10th–90th percentiles: whiskers). Transparent dots represent individual participants. ‘*’ indicates significant change responses (p < 0.05, two-tailed one-sample t-test) and ‘#’ indicates significant difference from Reg4-4.06 (p < 0.05, two-tailed paired t-test). Topographic maps (top) show RM distributions across the scalp, with dots indicating electrodes with significant change responses (p < 0.05, two-tailed one-sample t-test, FDR corrected).

Factors affecting change response

Having established the link between the change response and temporal merging, we next sought to delineate the temporal boundary for integration. We systematically varied the ICI length while maintaining a fixed ratio between click train 1 and click train 2 (Session 3 of Experiment 2; Fig. 3a). Our goal was to identify the ICI length at which the change response diminishes, indicating a failure of temporal integration. We found that increasing the ICI diminished the change response (n = 42; Fig. 3b), with the response magnitude decreasing as the ICI lengthened (Fig. 3c). Specifically, a strong change response was elicited by both Reg4-4.06 (t(41) = 6.31, p < 0.001, d = 0.97, one-sample t-test) and Reg8-8.12 (t(41) = 5.54, p < 0.001, d = 0.86, one-sample t-test). In contrast, configurations with longer ICIs, such as Reg32-32.48, failed to produce detectable change responses (t(41) = 0.89, p = 0.38, d = 0.14, BF10 = 0.24, one-sample t-test). While Reg16-16.24 did induce a significant change response in the occipital lobe (Supplementary Fig. 3), the overall average did not show a significant change response (t(41) = 0.16, p = 0.87, d = 0.03, BF10 = 0.17, one-sample t-test; Fig. 3c). These findings suggest that the upper limit for the perceptual integration of individual clicks is between 16 and 32 ms, at least under the current experimental conditions.

Fig. 3: The effect of three factors on change responses in transitional train.

a–c Effect of ICI length (Session 3, Experiment 2). a Stimulation setup: Four ICI combinations in regular transitional trains (4-4.06, 8-8.12, 16-16.24, 32-32.48 ms). b Grand-averaged change responses at POz across 42 participants. Shaded areas represent ± SEM. c Relative RM as a function of the ICI of click train 1. Boxplots show group-level RM values (mean: central line; standard error: box edges; 10th–90th percentiles: whiskers). Transparent dots represent individual participants. Asterisks (*) indicate significant change responses (p < 0.05, two-tailed one-sample t-test). Topographic maps (top) show RM distributions across the scalp, with dots indicating electrodes with significant change responses (p < 0.05, two-tailed one-sample t-test, FDR corrected). The asterisk indicators and topographic maps are similar in (f) and (i). d–f Effect of ICI contrast (Session 1, Experiment 2, n = 42). d Stimulation setup: Five levels of interval difference (4-4, 4-4.01, 4-4.02, 4-4.03, 4-4.06 ms). e Grand-averaged change responses at POz. f Relative RM as a function of ICI contrast. Asterisks (*) indicate significant change responses (p < 0.05, two-tailed one-sample t-test). Topographic maps and significance markers are as in (c). g–i Effect of regularity (Session 4, Experiment 2, n = 24). g Stimulation setup: Four levels of ICI standard deviation in irregular combinations (µ/400, µ/200, µ/100, µ/50), with Reg4-4.06 as control (σ = 0). h Grand-averaged change responses at POz for different ICI variability levels. i Relative RM as a function of ICI standard deviation. Asterisks (*) indicate significant change responses (p < 0.05, two-tailed one-sample t-test); hash symbols (#) indicate significant differences from Reg4-4.06 (p < 0.05, two-tailed paired t-test). Topographic maps and significance markers are as in (c) and (f).

To assess the resolution of temporal integration, we held the ICI of train 1 constant (4 ms) and systematically varied the ICI of train 2 so that it was longer than that of train 1 by 0.25% to 1.5% (Session 1 of Experiment 2; Fig. 3d). The results showed that a mere 0.5% difference in ICI (Reg4-4.02) was sufficient to elicit significant change responses (t(41) = 4.10, p < 0.001, d = 0.63, one-sample t-test; Fig. 3e), indicating a remarkably high temporal resolution in the process of auditory temporal integration. Larger ICI contrasts were associated with stronger change responses (Fig. 3e, f).
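For readers mapping stimulus labels to contrast levels, the short Python sketch below computes the ICI contrast as the relative lengthening of train 2 with respect to train 1. The function name and the percentage formula are our reading of "longer than train 1 by a ratio", not a definition quoted from the methods.

```python
def ici_contrast_percent(ici1_ms, ici2_ms):
    """Relative lengthening of the second ICI, in percent of the first."""
    return (ici2_ms - ici1_ms) / ici1_ms * 100.0


for ici2 in (4.01, 4.02, 4.03, 4.06):
    print(f"Reg4-{ici2}: {ici_contrast_percent(4.0, ici2):.2f}%")
# Reg4-4.01: 0.25%
# Reg4-4.02: 0.50%
# Reg4-4.03: 0.75%
# Reg4-4.06: 1.50%
```

In absolute terms, the 0.5% contrast of Reg4-4.02 corresponds to a difference of only 0.02 ms (20 µs) per interval.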

The initial phase of our investigation predominantly focused on click trains characterized by constant ICIs, designated as regular click trains. This prompted a subsequent inquiry into the perceptual implications of varying ICIs within click trains. To quantitatively assess the effect of regularity, we introduced varying degrees of variability to each train, with standard deviations of µ/400, µ/200, µ/100, µ/2, and 0 (where µ denotes the mean ICI, set at either 4 ms or 4.06 ms, and a standard deviation of 0 corresponds to Reg4-4.06) (Session 4 of Experiment 2; Fig. 3g). Significant change responses were observed only in the 0 (t(23) = 3.61, p < 0.01, d = 0.74, one-sample t-test) and µ/400 conditions (t(23) = 3.16, p < 0.01, d = 0.64, one-sample t-test; Fig. 3h, i), with no significant difference between the two conditions (i.e., 0 and µ/400; t(23) = 1.76, p = 0.09, d = 0.36, BF10 = 0.82, paired t-test). The responses with µ/200 were significantly weaker than those with 0 standard deviation (i.e., Reg4-4.06) (t(23) = 3.33, p < 0.01, d = 0.68, paired t-test).

Perception performance during transitional click train

We next investigated the perception of temporal merging using behavioral experiments together with EEG recording. The primary objective was to clarify the impact of click regularity on temporal merging (Fig. 1c–e and Fig. 3g–i) and to compare the perception of temporally merged auditory objects with that of pure tones. For this purpose, we developed an experimental paradigm that juxtaposed three distinct sets of stimuli to assess both the behavioral performance and change responses under various degrees of contrast (Experiment 3; Fig. 4a). The regular condition included transitional click trains transitioning from a regular click train with 4-ms ICI to another ICI (contrast levels: 0, 0.25%, 0.5%, 0.75%, 1.5%). The irregular condition comprised transitional click trains transitioning from an irregular click train (standard deviation: µ/2) with an average ICI of 4 ms to another average ICI (contrast levels: 0, 1.5%, 100%). The tone condition consisted of pure tones shifting from 250 Hz to another frequency (contrast levels: 0, 1.5%). Each block within the three conditions was designed to present a 1-s initial stimulus followed by a 1-s subsequent stimulus, concluding with a 2-s choice window. Participants were required to detect whether an auditory stimulus change had occurred (Fig. 4a).

Fig. 4: Psychological and EEG results during change detection task.

a Experimental setup (Experiment 3, n = 36) with three sets of stimuli (regular transitional click train, irregular transitional click train, and frequency-changing tone) used to evaluate change responses and behavioral performance under varying contrasts, featuring stimulus presentation followed by a choice window for the response. b Boxplots show psychological functions of ratio of change detection (mean: central line; standard error: box edges; 10th–90th percentiles: whiskers) for regular (red), irregular (blue), and tone (purple) stimuli. Transparent dots represent individual participants. c Group comparisons for (b) (*p < 0.05; ns: non-significant, two-tailed paired t-test). Error bars represent ± SEM. d, e Scatterplots compare change detection ratios between Reg4-4.06 and Irreg4-4.06 (d), and Reg4-4.06 and Tone250-246 (e), respectively. f, g Grand-averaged change responses at POz show comparisons between Reg4-4.06 and Irreg4-4.06 (f), and Reg4-4.06 and Tone250-246 (g), with yellow vertical bars indicating time windows with significant differences (p < 0.05, permutation test). Shaded areas represent ± SEM. h, i Scatterplots of the relative RM of change responses comparing: Reg4-4.06 vs. Irreg4-4.06 (h), and Reg4-4.06 vs. Tone250-246 (i), where p < 0.001 for paired t-tests. Topographic maps display RM distributions across the scalp under Reg4-4.06 (h and i, top left), Irreg4-4.06 (h, bottom right), and Tone250-246 (i, bottom right), with dots indicating channels with significant change responses (p < 0.05, paired t-test, FDR corrected).

The results showed that change detection performance improved as the difference between the first and the second stimulus increased (Fig. 4b). Remarkably, a 1.5% contrast in the regular condition (Reg4-4.06) led to a detection rate of 98.8% (n = 36; Fig. 4c), in stark contrast to the detection rate of 35.7% observed in the irregular condition (Irreg4-4.06), which did not significantly differ from its control, Irreg4-4 (t(35) = 0.93, p = 0.36, d = 0.16, BF10 = 0.27, paired t-test; Fig. 4c), suggesting no perceptual distinction between the two segments in the irregular transitional click train. The detection rate in the irregular condition reached 87.4% when the contrast was increased to 100% (i.e., Irreg4-8, Fig. 4b). For pure tones, the detection rate was 90.1% when the tone shifted from 250 Hz to 246 Hz (Tone250-246). Note that a 250 Hz tone has a period of 4 ms, and a 246 Hz tone has a period of roughly 4.06 ms. The detection rate for Tone250-246 was significantly higher than the control condition (Tone250-250), in which the tone was always 250 Hz (t(35) = 25, p < 0.001, d = 4.17, paired t-test; Fig. 4c), yet lower than that observed for Reg4-4.06 (t(35) = 3.55, p < 0.01, d = 0.59, paired t-test; Fig. 4c, e). Furthermore, subject-by-subject comparisons revealed that most subjects had a higher detection rate for Reg4-4.06 than for Irreg4-4.06 (t(35) = 11.59, p < 0.001, d = 1.93, paired t-test; Fig. 4d) and Tone250-246 (Fig. 4e). In summary, these findings emphasize the enhanced performance in the regular condition in identifying contrast changes compared to both the irregular condition and the pure tone condition.
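The correspondence between the click-train ICIs and the tone frequencies follows from the reciprocal relation between period and rate. The Python sketch below makes the arithmetic explicit; the function names are illustrative only.

```python
def ici_ms_to_rate_hz(ici_ms):
    """Repetition rate (Hz) of a regular click train with the given ICI."""
    return 1000.0 / ici_ms


def rate_hz_to_period_ms(rate_hz):
    """Period (ms) of a periodic sound with the given rate in Hz."""
    return 1000.0 / rate_hz


for ici in (4.0, 4.06):
    print(f"ICI {ici} ms -> {ici_ms_to_rate_hz(ici):.1f} Hz")
# ICI 4.0 ms -> 250.0 Hz
# ICI 4.06 ms -> 246.3 Hz
```

The 4.06 ms ICI therefore corresponds to about 246.3 Hz, which matches the 246 Hz tone in Tone250-246 after rounding to the nearest hertz.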

For the EEG change responses in the three conditions, Reg4-4.06 evoked stronger change responses than Irreg4-4.06 (Fig. 4f) and Tone250-246 (Fig. 4g). Indeed, no significant change response was observed for Irreg4-4.06 (Fig. 4f). Individual results also indicated that most subjects showed stronger change responses for Reg4-4.06 than for Irreg4-4.06 (t(35) = 6.86, p < 0.001, d = 1.14, paired t-test; Fig. 4h) and Tone250-246 (t(35) = 4.41, p < 0.001, d = 0.73, paired t-test; Fig. 4i), consistent with the behavioral results (Fig. 4d, e). Interestingly, the variation in change response amplitude was correlated with decision-making in the more difficult conditions, such as Reg4-4.01. In Reg4-4.01, trials on which a change was reported were typically accompanied by a stronger change response than trials on which no change was reported (Supplementary Fig. 4).

The effect of temporal continuity

To investigate the impact of temporal continuity on the change responses, we designed two sets of stimuli: one set without gaps of silence between click train 1 and click train 2 (No-gap) and the other set with a gap of 600-ms silence between the two click trains (Gap) (Experiment 4; Fig. 5a). Four transitional click trains were used: Reg4-4.01, Reg4-4.02, Reg4-4.03, and Reg4-4.06. Participants were asked to detect whether an auditory stimulus change had occurred. The behavioral performance was better for the No-gap click trains than for the Gap click trains (n = 34; Fig. 5b), with most participants showing this pattern (t(33) = 6.82, p < 0.001, d = 1.17, paired t-test; Fig. 5c).

Fig. 5: The effect of temporal continuity on psychological and EEG responses during change detection task.

a Experimental setup with two stimulus sets: the transitional train (No-gap) with varying contrasts (Experiment 3: Reg4-4, Reg4-4.01, Reg4-4.02, Reg4-4.03, Reg4-4.06), and a similar set that includes a 600 ms gap between click trains (Experiment 4). Participants identified auditory changes by pressing designated buttons. b Boxplots show psychological functions of ratio of change detection (mean: central line; standard error: box edges; 10th–90th percentiles: whiskers) for No-gap (red) and Gap (blue) tasks. Transparent dots represent individual participants (n = 34). c Scatterplots of behavioral thresholds in the two behavioral conditions (p < 0.001, two-tailed paired t-test). d Grand-averaged change responses to different ICI contrast at POz in No-gap condition. Shaded areas represent ± SEM. e Grand-averaged responses at POz time-locked to the onset of the second sound under different stimuli in Gap condition. f Boxplots show the average P1/cP1 response as a function of ICI contrast for change responses (mean: central line; standard error: box edges; 10th–90th percentiles: whiskers) under No-gap (cP1 response, red) and Gap (P1 response, blue) conditions. Transparent dots represent individual participants. g The average N2/cN2 response as a function of ICI contrast under the two conditions.

Subsequently, we examined the EEG responses, superimposing all contrast conditions for both the No-gap (Fig. 5d) and Gap (Fig. 5e) stimuli. The responses to varying contrast conditions were distinguishable in the No-gap condition (Fig. 5d) but nearly indistinguishable in the Gap condition (Fig. 5e). Two components, P1 and N2, were identified from the onset response of the second click train, as shown in Fig. 5e. We plotted the tuning curves for P1 ([70, 120] ms) and N2 ([133, 183] ms), compared with cP1 ([90, 140] ms) and cN2 ([221, 271] ms). As the contrast increased, the average response magnitude (RM) also increased, displaying clear tuning in the No-gap condition for both peak window (F(3,132) = 6.46, p < 0.001, η2 = 0.13, one-way ANOVA; red line in Fig. 5f) and trough window (F(3,132) = 19.52, p < 0.001, η2 = 0.31, one-way ANOVA; red line in Fig. 5g), while remaining nearly constant in the Gap condition for both windows (P1/cP1 window: F(3,132) = 1.38, p = 0.25, η2 = 0.03, BF10 = 0.07; N2/cN2 window: F(3,132) = 2.78, p = 0.04, η2 = 0.06, BF10 = 0.43, one-way ANOVA test; blue line in Fig. 5f, g). These EEG findings align with the psychological results regarding thresholds (Fig. 5c), emphasizing the role of temporal continuity in both psychological perception and neural processing.

Potential clinical application

Considering the fundamental role of temporal integration in the brain24,25 and its relevance to many psychiatric diseases19,26,27,28,29, the change response is a promising candidate tool for diagnosis. To explore the potential for clinical application of this paradigm, we conducted 64-channel EEG recordings in 22 coma subjects using transitional click train stimuli: Reg4-4 and Reg4-5 (Experiment 5).

For coma patients, both onset and change responses were small, even in Reg4-5, contrasting with the healthy subjects (Supplementary Fig. 5a). The scatter plots of onset vs. change responses showed significant overlap between coma subjects and healthy subjects. This overlap was probably due to the presence of slow oscillations with larger amplitudes localized to specific channels, along with the prolonged latency and extended duration of the auditory response in coma subjects (Supplementary Fig. 6a–c). To quantify change responses in subjects with impaired consciousness, global field power (GFP) was employed due to its robustness to spatial variability and enhanced sensitivity to response latency30,31. GFP calculates the standard deviation of EEG data across all electrodes at each sampling point, thereby mitigating the influence of spatial variability in electrode placement or individual differences in brain anatomy. This is crucial when studying subjects with impaired consciousness, where localized differences in brain function might occur due to injury or pathology. A robust onset response was detected in one example coma subject using GFP (Supplementary Fig. 6c), although no visible onset response was detected in amplitude (Supplementary Fig. 6b). However, no change response was detected even using GFP (Supplementary Fig. 6c). In the population, no visible onset or change responses were detected using GFP in coma patients, whereas significant robust responses were observed in healthy subjects (Fig. 6a).
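The GFP computation described above (the standard deviation across electrodes at each sampling point) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the data layout (a list of channels, each a list of samples), the function name, and the use of the population standard deviation are our choices, and preprocessing steps such as re-referencing are omitted.

```python
import statistics


def global_field_power(eeg):
    """GFP time course: SD across channels at every sample.

    eeg : list of channels, each a list of samples of equal length.
    Assumption: the population standard deviation is used.
    """
    if len(eeg) < 2:
        raise ValueError("GFP needs at least two channels")
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same length")
    return [statistics.pstdev([channel[t] for channel in eeg])
            for t in range(n_samples)]


# Toy example with 4 channels and 3 samples.
toy = [[1.0, 0.0, 2.0],
       [-1.0, 0.0, 2.0],
       [1.0, 0.0, -2.0],
       [-1.0, 0.0, -2.0]]
gfp = global_field_power(toy)
```

Because GFP collapses the scalp distribution into a single value per time point, it does not depend on choosing a particular electrode, which is the property the text relies on for patients whose response topography may be atypical.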

Fig. 6: The effect of consciousness on change responses in transitional train.

a Averaged GFP of coma participants from Experiment 5 (blue line, n = 22) and healthy participants (red line, n = 24) under Reg4-5. Shaded areas represent ± SEM. b Scatterplots of GFP indices of change responses and onset responses for coma participants (blue) and healthy participants (red). The magenta circles indicate one subject (#041802) before (hollow) and after (filled) recovery.

Furthermore, the scatter plot of onset vs. change responses in GFP effectively separated coma patients from healthy subjects, suggesting the transitional click train paradigm as a good tool for distinguishing between the two groups (Fig. 6b). More interestingly, the change response may gradually recover as the coma patient regains consciousness (Supplementary Fig. 7), indicating that the transitional click train paradigm could potentially monitor the entire recovery process of coma patients. However, no correlation was found between the CRS-R score, a standard method for quantifying the degree of coma, and either onset or change responses (Supplementary Fig. 8).

Discussion

Our study examined the mechanisms of temporal merging within auditory perception, elucidating how the human auditory system assimilates discrete sound elements into unified auditory objects. With temporal merging, a click train with minimal ICIs gives a distinct auditory experience. Specifically, regular click trains (Reg4-4.06) prompted more pronounced change responses in the auditory brain than irregular click trains (Irreg4-4.06), highlighting the significant impact of temporal regularity on auditory processing (Figs. 1 and 4). Further analysis demonstrated that the change response is tied to the integration of multiple intervals, suggesting it as a marker for the perceptual transition between distinct auditory objects via temporal merging (Fig. 2). This response is notably affected by several factors: the length of ICI (Fig. 3a–c), the ICI ratio (ICI2 vs. ICI1) (Fig. 3d–f), and the regularity of the click train (Fig. 3g–i). Additionally, behavioral experiments showed enhanced change detection rates for regular click trains (Reg4-4.06) compared to irregular click trains and pure tones, corroborated by stronger EEG change responses (Fig. 4). Temporal continuity significantly affected behavioral and EEG responses, with better performance and clear tuning curves for continuous click trains compared to those with gaps (Fig. 5). Finally, the GFP method effectively distinguished coma patients from healthy subjects, suggesting the potential clinical application of transitional click trains for diagnosing and monitoring recovery in impaired consciousness (Fig. 6).

Change response in transitional click train as an indicator of temporal integration

Click trains with ICIs less than ~33 ms are often perceived as pitch32, and it has been suggested that the analysis of regularity in click trains differs for ICIs above and below 40–60 ms33. Traditional theories differentiate pitch perception based on the auditory system’s ability to segregate individual harmonic components, categorizing pitch as either resolved or unresolved harmonics27,28. Resolved harmonics arise from distinct components processed by separate auditory filters, whereas unresolved harmonics involve closely spaced components within a single filter, relying on temporal coding for pitch extraction. Interestingly, sounds with the same repetition rate but highly different spectral compositions often evoke the same pitch, whereas sounds with similar spectra can produce significantly different pitch percepts27,28. This demonstrates that the frequency-to-place mapping performed by the cochlea does not necessarily correspond to a frequency-to-pitch mapping29. The temporal pitch induced by click trains is distinct because it relies solely on the temporal regularity of successive auditory events rather than spectral components29,30,31,23. However, the neuronal mechanisms underlying temporal pitch perception remain unresolved, and the relationship between temporal pitch and temporal integration has not been clearly established. Our research provides direct evidence linking temporal pitch perception to temporal integration.

Auditory research utilizing click trains as stimuli has unveiled intricate neuronal responses in the auditory system on both single-neuron and systems neuroscience levels. On the single-neuron level, neurons display a remarkable capability for precise temporal coding where individual spike activities precisely align with specific intervals between the clicks34. Despite the prominence of this temporal alignment, rate coding emerges as another vital mechanism, particularly at accelerated click rates35. Lu et al.10 identified two distinct populations of neurons: one that synchronizes to slow sound sequences and another that encodes rapid events through firing rates. However, these studies mainly focused on how individual clicks within a train are represented, largely overlooking the holistic perception of the click train as a coherent object36. At the macroscopic level, click trains have been extensively used to study auditory steady-state responses (ASSR)37,38,39, where the neural response follows the same frequency of auditory stimuli, and the auditory response can be disrupted by an additional click12. These studies, similar to those at the single-neuron level, concentrate on responses to individual clicks, leaving the mechanism of how the brain integrates regular clicks into pitch perception unresolved. Recently, the holistic representation of sound has been investigated in the frequency domain, and researchers have found that auditory cortex (AC) neurons may exhibit bursting responses specifically to the configuration of tones but not to any constituent tone40,41. However, the holistic representation in the temporal domain, especially for sound through temporal integration, has been seldom addressed. This gap exists because disentangling neural responses to individual clicks from those induced by the holistic perception of the whole click train as pitch poses a significant challenge. 
Consequently, no brain signal has yet been identified that adequately represents auditory events through temporal merging with click trains in prior research10,34,35, highlighting a crucial area for future investigation at both single-neuron and macroscopic levels.

To navigate this intricacy, we propose the innovative concept of a transitional train, as illustrated in Fig. 1. A typical onset response to transitional trains was observed in the first 300 ms of EEG signals, followed by an adaptation period from ~300 to 1000 ms, during which no discernible auditory response to individual clicks or the train was detected (Supplementary Fig. 2a). However, the introduction of a second click train with a slightly changed ICI (e.g., Reg4-4.06) within a transitional train elicited a change response in the adapted auditory brain, followed by subsequent adaptation (Fig. 1d). Moreover, the transitional train also introduced a perceptual switch psychologically (Fig. 4). Since this change response in the EEG signal is not solely attributed to local temporal changes but is linked to temporal merging (Fig. 2), it most likely reflects a perceptual switch, signifying a transition between distinct temporal-merging auditory objects (Fig. 4). The key aspect underlying the transitional click train is that it maintains the presentation of individual clicks, which leads to consistent adaptation of the auditory brain (Fig. 1), while simultaneously introducing a perceptual switch (Fig. 4).

The critical innovation of our paradigm lies in maintaining a regular stream of clicks, ensuring ongoing cortical adaptation, while subtly shifting the temporal pattern to elicit a new percept without reintroducing stimulus onset artifacts. This provides a unique framework to examine the neural basis of short-timescale temporal integration—a process distinct from higher-order regularity detection. We acknowledge prior work by Barascud et al. (2016)42, who used tone-pip sequences to explore the brain’s sensitivity to transitions between random and regular acoustic patterns. While conceptually related in probing sensitivity to structure, their paradigm focused on complex, longer-timescale regularities, and statistical learning. In contrast, our transitional click train design targets integration at the millisecond scale, isolating the mechanisms by which the brain fuses discrete auditory events into a single perceptual object—a function central to pitch-like perception. Importantly, we emphasize that the change response identified here is not a direct neural correlate of pitch itself, but rather a marker of temporal integration—indicating when the brain transitions from one temporally fused percept to another. This distinction clarifies the functional role of the change response: it signals a perceptual shift between temporally structured auditory objects, rather than encoding the pitch per se.

Three key factors influence the change response: the length of the ICI (Fig. 3a–c), the ICI ratio (ICI₂ vs. ICI₁) (Fig. 3d–f), and the regularity of the click train (Fig. 3g–i). Our research found that when the ICI length exceeds 16 ms, the auditory system is unable to elicit the change response, potentially suggesting a limit of temporal merging in ICI. This threshold is notably lower than what psychological studies have suggested, where the perception of a unified sound begins to falter at ICIs >29.6 ms23 (Supplementary Fig. 1), and it is also below the 33 ms threshold often associated with pitch perception32. This discrepancy might be attributed to the inherent limitations of EEG recordings, which typically have a poor signal-to-noise ratio, underscoring the need for further investigation into the ICI threshold for temporal merging using more sophisticated methodologies. The auditory brain exhibits hypersensitivity to ICI ratios; even a 0.5% difference (Reg4–4.02) can evoke a robust change response (Fig. 3d–f). Additionally, the regularity of the click train, which not only characterizes the temporal structure but also requires extended time for integration to extract the train’s regularity, reflects context-dependent temporal merging (Fig. 3g–i).

The transitional click train paradigm presents significant opportunities for fundamental research in auditory science. Traditional auditory research has predominantly concentrated on the frequency domain, guided by the auditory system’s tonotopic organization, where distinct frequencies are processed separately along the auditory pathway1. Nevertheless, the importance of the temporal dimension in auditory processing cannot be stressed enough. This temporal aspect is critical for speech and music perception, as well as for distinguishing environmental sounds3. Recent advances in neuroimaging and electrophysiology have enhanced our understanding of temporal integration mechanisms in oral language, revealing a hierarchical structure of temporal integration in the human brain43,44. However, there remains a significant gap in our understanding of temporal integration in non-human animals, primarily because no clear neuronal signature for this process has been available, which has impeded research at the neuronal level and in animal studies. The identification of a change response in transitional click trains in our study provides a promising pathway to investigate this area further. Future research could employ the transitional click train paradigm to examine the mechanisms underpinning temporal integration at the neuronal level in animal subjects.

In addition to providing neural signals related to temporal integration, our study elucidates the neuronal mechanisms underlying pitch perception evoked by click trains. Our findings highlight the role of temporal integration as a key process in pitch perception. Traditional theories distinguish between resolved and unresolved harmonics based on the auditory system’s ability to segregate individual harmonic components45,46. Resolved harmonics arise from distinct components processed by separate auditory filters, while unresolved harmonics involve closely spaced components processed by a single filter, relying on temporal coding for pitch extraction. Interestingly, sounds with the same repetition rate but very different spectral compositions often evoke the same pitch, while sounds with similar spectra can produce significantly different pitches. This observation demonstrates that the frequency-to-place mapping performed by the cochlea does not necessarily correspond to a frequency-to-pitch mapping47. Temporal pitch, induced by click trains, is distinct in that it relies solely on the temporal regularity of successive auditory events rather than on the spectral components47,48,49,50. Our study provides compelling neuronal evidence supporting this process, demonstrating that the change response reflects the integration of temporal information into a unified auditory pitch (Fig. 2). Previous research has used paradigms similar to transitional click trains to investigate temporal pitch sensitivity51 and observed the change response in EEG signals of cats50. Our insertion experiments further explored the nature of the change response, with a focus on temporal integration (Fig. 2).

Behavioral relevance of the transitional click train

The alignment between psychological findings and EEG data underscores a notable facet of our research. On the one hand, our psychological data reveal the heightened sensitivity of regular transitional click trains compared to both pure tones (Fig. 4e, g, i) and irregular click trains (Fig. 4d, f, h). Concurrently, EEG signals exhibit stronger responses to regular click trains (Fig. 4f–i). Regular click trains, especially those with shorter ICIs, are often perceived to have pitch-like qualities; thus, change detection in regular click trains may reflect a perceptual shift in pitch rather than a simple temporal discontinuity. Traditionally, this pitch perception has been explained through theories and computational models focusing on the basilar membrane’s processing13. The pronounced sensitivity of regular click trains over pure tones underscores the critical role of temporal integration in the central nervous system for refining fine temporal structures. This suggests that the pitch perception associated with regular click trains might originate in the central auditory system rather than the basilar membrane. This hypothesis necessitates further exploration, particularly employing our innovative transitional click train in animal studies.

Additionally, both psychological and EEG responses demonstrate a dependency on temporal continuity. The introduction of a 600 ms gap adversely affects change detection capabilities (Fig. 5b, c) and alters the tuning of change differences (Fig. 5f, g). Given that the change response to the transitional click train systematically correlates with the contrast ratio (Fig. 5f, g), whereas responses to the second sound in the Gap condition remain relatively constant across different contrast ratios, it suggests that the change response primarily signifies the signal of perceptual switching between pitches rather than the perception of the second pitch. The influence of temporal discontinuity on both behavioral and neural responses accentuates the essential role of temporal integration within the auditory system, suggesting that seamless auditory perception relies on the continuous flow of temporal information.

Change response as a biomarker in clinical application

Three key factors influence the change response (Fig. 3) and are therefore related to temporal merging: the length of the ICI, the difference between ICIs, and the regularity of the click train. They offer diverse metrics for characterizing temporal integration and may serve as valuable tools in clinical applications, including as potential biomarkers for mental disorders. To explore this possibility, we investigated coma patients (Fig. 6). The change response was not detectable in these patients (Fig. 6a); in some cases, an onset response was present while no change response was visible (Supplementary Fig. 6c), suggesting that the signal may reflect a state-dependent marker of consciousness. Interestingly, the change response may recover as the patient regains consciousness (Supplementary Fig. 7). We acknowledge that our current sample size is limited, and these findings alone do not establish the change response as a definitive diagnostic or prognostic indicator. However, they point toward its potential as a neural correlate of awareness, and the observed change response to transitional trains provides a possible pathway for refining coma monitoring techniques. Future studies with larger cohorts and longitudinal tracking will be necessary to determine whether this EEG marker can reliably predict recovery trajectories in patients with impaired consciousness. Beyond disorders of consciousness, we propose the use of transitional trains for the assessment of psychiatric conditions, because temporal integration, a central component of brain functionality24,25, has been found to be compromised in conditions such as schizophrenia26, autism spectrum disorders19,27, attention deficit hyperactivity disorder28, and Parkinson’s disease29.
Given these findings, the signal of temporal integration might be poised to emerge as a pivotal biomarker for broader clinical diagnostics.

Materials and methods

Experimental procedure and participants

The study comprised five experiments, all conducted in accordance with the Declaration of Helsinki (2013)52. Experiment 1 involved a total of 22 participants (14 males and 8 females, mean age: 29.36 years, standard deviation: 3.19). A total of 42 participants (20 males and 22 females, mean age: 23.36 years, standard deviation: 2.55) participated in Experiment 2 (Sessions 1 and 3), Experiment 3, and Experiment 4. A total of 24 participants (14 males and 10 females, mean age: 24.87 years, standard deviation: 7.43) participated in Sessions 2 and 4 of Experiment 2; they also served as the healthy controls for comparison with coma patients in Experiment 5. Participants maintained a stationary head position while listening to auditory stimuli and responding via keyboard presses. Experiments including healthy participants were approved by the Institutional Review Board (IRB-20230131-R), and informed consent was obtained from all participants. Experiment 5 involved 22 coma participants with impaired consciousness (16 males and 6 females, mean age: 56.52 years, standard deviation: 15.96), including one participant who was recorded again after recovery from coma. The level of consciousness in coma participants was measured using the Coma Recovery Scale-Revised (CRS-R) scores. This experiment was approved by the Natural Science Foundation of Zhejiang Province (LGF22H170006). All ethical regulations relevant to human research participants were followed. Each stimulus was presented at least 40 times to each participant across all experimental sessions.

Experiment 1

This was a gap detection task (Supplementary Fig. 1a) involving click trains of 1024 ms duration with varying ICIs (4, 8, 16, 32, 64, 128, 256 ms). Participants were positioned in a chair facing a keyboard and speaker. A 100-ms, 1000-Hz cue tone was presented 800 ms after the end of each click train. Participants were instructed to press the right key on the keyboard if a gap was detected in the click train and the left key if the click train was perceived as continuous. Key presses were valid within 700 ms after the cue onset. Experiment 1 aimed to determine the upper psychological threshold for perceiving temporally integrated click trains. We found that when the inter-click interval (ICI) exceeded 29.6 ms, participants began to perceive individual clicks rather than a unified auditory object. This threshold, together with neuronal data from previous studies10,15, was used to constrain the ICI values employed in Experiments 2, 3, and 4. This approach ensured that the tested intervals fell within the perceptual range for temporal integration, as well as within the adaptation range for eliciting neuronal responses to each click.

Experiment 2

This included four passive listening sessions. Experiment 2 was designed to systematically examine the parameters influencing the change response and to elucidate the underlying nature of this neural response.

  • Session 1 (ICI contrast): This session consisted of five regular transitional trains (Reg4-4, Reg4-4.01, Reg4-4.02, Reg4-4.03, and Reg4-4.06) and two irregular transitional trains (Irreg4-4 and Irreg4-4.06), and a tone-pair (Tone250-246). In Fig. 1 (showing Reg4-4.06 and Irreg4-4.06) and Supplementary Fig. 2 (displaying Reg4-4, Reg4-4.06, Irreg4-4, and Irreg4-4.06), we focus on four transitional click trains to introduce the paradigm and illustrate the fundamental neural responses evoked by both regular and irregular stimuli. Analyses presented in Fig. 3d–f encompass all five regular transitional trains from this session, providing a detailed assessment of neural sensitivity to subtle differences in inter-click interval.

  • Session 2 (Insertion experiment): 1, 2, 4, 8, 16, and 32 intervals (each with an interval of 4.06 ms, not silent) were inserted into a click train with a 4 ms ICI (Fig. 2a). The inserted intervals were indeed slightly lengthened ICIs rather than silent gaps, preserving acoustic stimulation while altering temporal pattern. Controls included Reg4-4 and Reg4-4.06.

  • Session 3 (ICI length): Four regular transitional trains (Fig. 3a) were presented (Reg4-4.06, Reg8-8.12, Reg16-16.24, and Reg32-32.48), which altered ICI of train 1 but maintained a fixed ratio of ICI between train 1 and train 2.

  • Session 4 (ICI variance): Four Irreg4-4.06 transitional trains with different standard deviations (σ = μ/400, μ/200, μ/100, and μ/2, where σ represents the standard deviation and μ represents the mean value of ICI) and one Reg4-4.06 (σ = 0) were randomly presented (Fig. 3g).

Experiment 3

This was a change detection task, aiming to investigate both the perceptual and neural correlates of temporal integration, with particular emphasis on how rhythmic regularity and ICI contrast contribute to the formation and detection of integrated auditory objects. Participants were required to report whether the sound changed by pressing one of two designated keys within 2 s after the end of the stimulus, with the left key indicating a change in the transitional stimulus and the right key indicating no change (Fig. 4a). This experiment included five regular transitional trains (Reg4-4, Reg4-4.01, Reg4-4.02, Reg4-4.03, and Reg4-4.06), three irregular transitional trains (Irreg4-4, Irreg4-4.06, and Irreg4-8), and two tone-pairs (Tone250-250 and Tone250-246, with 5-ms rise-fall edges).

Experiment 4

This experiment (Gap effect) was also a behavioral experiment, examining the effect of temporal gaps. A 600-ms gap between click train 1 and click train 2 resulted in Gap transitional trains (Fig. 5a). Participants judged whether the two click trains were different and reported following the same rule as in Experiment 3.

Experiment 5

This experiment involved a passive listening session in which coma patients were presented with two transitional click trains (Reg4-4 and Reg4-5; Fig. 6). For healthy controls, the same two transitional trains were presented in a corresponding session, serving as a direct comparison group for Experiment 5 (Fig. 6).

Auditory stimuli

The experiments were conducted within a sound-proof room. A single click consisted of a 0.2-ms pulse. Click trains were categorized as either regular, with a fixed ICI, or irregular, with random ICIs. For irregular click trains, the ICIs were randomized using a Gaussian distribution, and satisfied the following formula:

$$\begin{array}{c}0.3\times {\mu }_{i}\le {I}_{i,j}\le 1.7\times {\mu }_{i} \\ 1.2\times {\mu }_{i}\le {\sum }_{n=j}^{j+1}{I}_{i,n}\le 3.1\times {\mu }_{i} \\ 1.8\times {\mu }_{i}\le {\sum }_{n=j}^{j+2}{I}_{i,n}\le 4.6\times {\mu }_{i}\end{array}$$

where \({\mu }_{i}\) represents the average ICI (in milliseconds) of the \(i\)th train in the transitional click train and \({I}_{i,j}\) is the \(j\)th ICI in the \(i\)th train. The mean value of the Gaussian distribution matched the fixed ICI of a regular click train, while the standard deviation was a certain percentage of the fixed ICI (0.25%, 0.5%, 1%, or 50%). A transitional click train is formed by concatenating two click trains. For example, a Reg4-4.06 transitional train denotes the combination of two regular click trains: regular click train 1 (with an ICI of 4 ms) is seamlessly followed by regular click train 2 (with an ICI of 4.06 ms). Similarly, an irregular transitional train is composed of two irregular click trains with the given average ICIs. For continuous click trains that seamlessly transitioned from ICI1 to ICI2, the transition time was defined as the onset time of the first click after the first ICI2 interval (Fig. 1b). The ratio of ICI1 to ICI2 quantified the difference level between the two ICIs. Auditory stimuli were delivered through the Golden Field M23 sound player, driven by a Creative AE-7 Sound Blaster, with a sampling rate of 384 kHz. Sound delivery was controlled with Psychtoolbox 3 in MATLAB. Sound intensity was calibrated to maintain a constant level of 60 dB SPL (sound pressure level), using a ¼-inch condenser microphone (Brüel & Kjær 4954, Nærum, Denmark) and a PHOTON/RT analyzer (Brüel & Kjær, Nærum, Denmark).
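The construction of an irregular transitional train can be illustrated with a minimal Python sketch. It assumes that ICIs violating the constraints are simply redrawn one at a time (rejection sampling); the text does not state how the constraints were enforced, so this is an illustrative choice. All function names are hypothetical.

```python
import random

# Bounds (as multiples of the mean ICI) on sums of 1, 2, and 3 consecutive ICIs,
# taken from the three inequalities in the text.
BOUNDS = {1: (0.3, 1.7), 2: (1.2, 3.1), 3: (1.8, 4.6)}

def constraints_ok(icis, mu):
    """Return True if every window of 1, 2, or 3 consecutive ICIs is within bounds."""
    for width, (lo, hi) in BOUNDS.items():
        for j in range(len(icis) - width + 1):
            total = sum(icis[j:j + width])
            if not (lo * mu <= total <= hi * mu):
                return False
    return True

def irregular_icis(mu, sd_fraction, n, rng):
    """Draw n ICIs (ms) from a Gaussian with mean mu and SD sd_fraction * mu.

    Assumption: a candidate ICI is redrawn whenever it would violate a constraint.
    """
    icis = []
    while len(icis) < n:
        candidate = rng.gauss(mu, sd_fraction * mu)
        # Only windows that end at the candidate can be newly violated.
        if constraints_ok(icis[-2:] + [candidate], mu):
            icis.append(candidate)
    return icis

def click_times(icis1, icis2):
    """Click onset times (ms) for train 1 followed seamlessly by train 2."""
    times = [0.0]
    for ici in icis1 + icis2:
        times.append(times[-1] + ici)
    return times

def transition_time(icis1, icis2):
    """Onset of the first click after the first ICI2 interval."""
    return click_times(icis1, icis2)[len(icis1) + 1]

rng = random.Random(0)
train1 = irregular_icis(4.0, 0.5, 250, rng)    # Irreg4 with SD = 50% of the mean
train2 = irregular_icis(4.06, 0.5, 250, rng)   # Irreg4.06
print(len(click_times(train1, train2)), transition_time(train1, train2))
```

A regular train is the special case in which every ICI equals the mean, for example `click_times([4.0] * 250, [4.06] * 250)` for Reg4-4.06; the train lengths used here are placeholders, not values from the study.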

Data acquisition

Electroencephalogram (EEG) data of Experiment 2 (Sessions 1 and 3) and Experiments 3 and 4 were acquired using a 64-channel NeuroScan system (Compumedics, Australia). EEG data of Experiments 2 (Sessions 2 and 4) and Experiment 5 were acquired using a 64-channel NeuSenW system (Neuracle, China). The EEG cap schematics are depicted in Supplementary Fig. 9. In practice, we only used 59 electrodes for NeuSenW system recordings and 60 electrodes for NeuroScan system recordings. The ground electrodes for both systems were placed between Fpz and Fz in the frontal area. The reference electrode of the 64-channel NeuSenW wireless EEG cap was positioned between Cz and Pz, replacing CPz, while the reference electrode of the 64-channel NeuroScan Quick-Cap was placed between Cz and CPz. The EEG data were sampled at 1 kHz, and electrode placement followed the international 10-20 system protocol.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data analysis

Preprocessing

The data analyses were performed using MATLAB R2021b (MathWorks) and the FieldTrip toolbox53. Monopolar referencing was employed in this study, using the default single reference electrode of the EEG cap (Supplementary Fig. 9). The multichannel EEG data underwent several preprocessing steps. First, the full EEG data were band-pass filtered between 0.5 and 40 Hz. Then, 4-s epochs were obtained, spanning from -1 to 3 s relative to trial onset. Independent component analysis (ICA) was then applied to the epochs to remove electrooculogram (EOG) artifacts. After ICA, baseline correction was applied by subtracting the mean response within the baseline window from -200 to 0 ms relative to the onset of train stimulation for each trial. Following this, a relative threshold was used to evaluate motion artifacts for each trial, excluding those exceeding predefined thresholds. The relative threshold was determined based on the percentage of bad samples within a trial. A sample was flagged as a “bad sample” if it fell outside the range of mean ± 3 × SD for a trial across specific channels. Trials with over 20% of bad samples were labeled as “bad trials”. Additionally, channels with over 10% of bad trials were labeled as “bad channels”. Bad channels were first excluded by assessing bad samples using data from all channels; bad trials were then excluded from all channels by computing bad samples using data from the remaining good channels. Finally, the event-related potential (ERP) data were obtained by averaging the epoch data for each experimental condition, channel, and subject. Prior to applying inter-subject analysis, ERP data were normalized by each channel’s standard deviation per subject to reduce inter-channel variability.
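The relative-threshold rule can be sketched for a single channel as follows. This is an illustration only, not the authors' MATLAB implementation: the text does not specify how the mean and SD are computed, so the sketch assumes they are taken across trials at each time sample. The function names are hypothetical.

```python
from statistics import mean, pstdev

def bad_trial_flags(trials, n_sd=3.0, max_bad_fraction=0.20):
    """Flag bad trials for one channel.

    trials: list of equal-length lists of samples (one list per trial).
    Assumption: mean and SD are computed across trials at each time sample.
    """
    n_samples = len(trials[0])
    mu = [mean([tr[t] for tr in trials]) for t in range(n_samples)]
    sd = [pstdev([tr[t] for tr in trials]) for t in range(n_samples)]
    flags = []
    for tr in trials:
        # A sample is "bad" if it lies outside mean +/- n_sd * SD.
        n_bad = sum(1 for t in range(n_samples) if abs(tr[t] - mu[t]) > n_sd * sd[t])
        flags.append(n_bad / n_samples > max_bad_fraction)
    return flags

def is_bad_channel(trial_flags, max_bad_trial_fraction=0.10):
    """A channel is bad if more than 10% of its trials are bad."""
    return sum(trial_flags) / len(trial_flags) > max_bad_trial_fraction

# Toy data: 19 flat trials and one trial with a large offset (e.g., a motion artifact).
trials = [[0.0] * 50 for _ in range(19)] + [[100.0] * 50]
flags = bad_trial_flags(trials)
print(flags.count(True), is_bad_channel(flags))
```

In the two-pass procedure described above, the same flagging would be run first over all channels to exclude bad channels and then again over the remaining good channels to exclude bad trials.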

Permutation test

For time-sample level comparisons of ERP or global field power (GFP) between two conditions, a two-tailed cluster-based permutation test was conducted using the ‘ft_timelockstatistics’ function from the FieldTrip toolbox in MATLAB. The procedure was as follows: 1) A two-tailed independent samples t-test was performed at each time point and channel to compute the t-value comparing Dataset A and B. 2) A cluster-defining threshold (p < 0.05) was applied to identify significant time points. Time points exceeding this threshold were marked as “candidate significant points”. Clusters were then formed by grouping consecutive significant time points and neighboring significant electrodes (for GFP comparison, only in the time dimension), using a Statistical Parametric Mapping (SPM) labeling algorithm54. 3) The condition labels for Dataset A and B were randomly permuted. For each permutation, a new t-value matrix was calculated, and clusters were re-formed using the same threshold. The largest cluster-level statistic (e.g., sum of t-values) from each permutation was retained, forming a null distribution. 4) Finally, the observed clusters from step 2 were compared to the null distribution. A cluster was deemed significant if its cluster-level statistic was >95% of permuted clusters (p < 0.05). This permutation test was conducted at the inter-subject level within [-200, 600] ms relative to change point. For example, Dataset A represented ERP/GFP data under Reg4-4.06 from 42 subjects, while Dataset B represented ERP/GFP data under Reg4-4 from the same subjects.
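The four steps above can be sketched for the time-only case (as used for GFP). This is a simplified illustration in plain Python, not the FieldTrip implementation. The cluster-defining threshold is passed in as a critical t-value `t_crit`, which the caller must look up for the relevant degrees of freedom, and a pooled-variance t-statistic is assumed. All names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

def t_values(a, b):
    """Independent-samples t-value (pooled variance) at each time point.

    a, b: lists of subjects, each a list of samples over time.
    """
    na, nb = len(a), len(b)
    out = []
    for t in range(len(a[0])):
        xa = [s[t] for s in a]
        xb = [s[t] for s in b]
        pooled = ((na - 1) * variance(xa) + (nb - 1) * variance(xb)) / (na + nb - 2)
        out.append(0.0 if pooled == 0 else
                   (mean(xa) - mean(xb)) / sqrt(pooled * (1 / na + 1 / nb)))
    return out

def cluster_sums(tvals, t_crit):
    """Sum t-values over runs of consecutive supra-threshold points of the same sign."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = 1 if t > t_crit else -1 if t < -t_crit else 0
        if s != 0 and s == sign:
            current += t
        else:
            if sign != 0:
                sums.append(current)
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums

def cluster_permutation_test(a, b, t_crit, n_perm=1000, seed=0):
    """Return (cluster statistic, p-value) for each observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sums(t_values(a, b), t_crit)
    pooled, na = a + b, len(a)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permute condition labels
        sums = cluster_sums(t_values(pooled[:na], pooled[na:]), t_crit)
        # Keep the largest absolute cluster statistic (two-tailed test).
        null.append(max((abs(s) for s in sums), default=0.0))
    return [(s, sum(1 for m in null if m >= abs(s)) / n_perm) for s in observed]

# Toy data: 10 subjects per condition, 30 time points, effect at samples 10-19.
rng = random.Random(42)
a = [[rng.gauss(0, 1) + (3.0 if 10 <= t < 20 else 0.0) for t in range(30)] for _ in range(10)]
b = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(10)]
print(cluster_permutation_test(a, b, t_crit=2.101, n_perm=200))
```

For ERP comparisons the clustering would additionally extend across neighboring electrodes, which this one-dimensional sketch omits.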

Quantification of change response

The change response comprised two major ERP components, cP1 and cN2 (Supplementary Fig. 2c and Fig. 1d), which exhibited opposite polarities in the frontal and temporal-parietal-occipital scalp regions (Supplementary Fig. 2a). We identified these EEG components based on peak detection in the GFP averaged across subjects. For instance, the first peak (118 ms) of the GFP following the change point of Reg4-4.06 was identified as the cP1 component, with the second peak (198 ms) as the cN2 component (red curve, Supplementary Fig. 2c). The P1 component of the onset response peaked at 95 ms after the onset of the click train with 4-ms ICI, and the N2 component peaked at 158 ms. To quantify individual components (e.g., cP1 of Reg4-4.06 in Supplementary Fig. 2, and P1/cP1/N2/cN2 in Fig. 5f, g) for each electrode, we calculated the mean ERP value within a [-25, 25] ms time window centered around the peak time identified from GFP, subtracting the baseline mean ([-200, 0] ms relative to change point or onset). For a global quantification of the entire change or onset response across the scalp, we used the relative response magnitude (RM), defined as the root mean square (RMS) of the ERP over a specific time window encompassing both peak and trough responses, with the baseline RMS subtracted. The RM calculation proceeded as follows:

$${{\rm{R}}}{{{\rm{M}}}}_{i}\left[{t}_{1},{t}_{2}\right]=\sqrt{\frac{\mathop{\sum }_{t={t}_{1}}^{{t}_{2}}{\left[{r}_{i}\left(t\right)\right]}^{2}}{n}}$$

where \({r}_{i}\left(t\right)\) represents the ERP of the \({i}^{{th}}\) channel at time \(t\), \(n\) represents the number of samples within the time window from \({t}_{1}\) to \({t}_{2}\), and \({{\rm{R}}}{{{\rm{M}}}}_{i}\) is the response magnitude of channel \(i\). The relative RM was defined as:

$${{\rm{R}}}{{{\rm{M}}}}_{i}={{\rm{R}}}{{{\rm{M}}}}_{i}\left[{t}_{1},{t}_{2}\right]-{{\rm{R}}}{{{\rm{M}}}}_{i}\left[-200,0\right]$$

The 200-ms baseline before change represented the steady state response of click train 1. The analysis time window for the change response was determined using a two-tailed cluster-based permutation test of the GFP data between the transitional click trains Reg4-4.06 and Reg4-4 at the time-sample level (Supplementary Fig. 2c). Specifically, the change response time windows were [74, 251] ms relative to change point for all passive listening sessions. The relative RMs of all channels were then averaged for each subject to facilitate comparison across experimental conditions (e.g., tunings and scatterplots).
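As a minimal sketch of the relative RM for one channel, the following Python code computes the RMS within the response window and subtracts the RMS of the 200-ms baseline. The function names and the toy signal are hypothetical; only the formula and the [74, 251] ms window come from the text.

```python
from math import sqrt

def rms(values):
    """Root mean square of a list of samples."""
    return sqrt(sum(v * v for v in values) / len(values))

def relative_rm(erp, times_ms, window, baseline=(-200, 0)):
    """Relative response magnitude of one channel.

    erp, times_ms: equal-length lists; times are in ms relative to the change point.
    window, baseline: (start, end) in ms, both ends inclusive.
    """
    def segment(lo, hi):
        return [v for v, t in zip(erp, times_ms) if lo <= t <= hi]
    return rms(segment(*window)) - rms(segment(*baseline))

# Toy ERP sampled at 1 kHz: amplitude 1 at baseline, amplitude 2 inside the response window.
times = list(range(-200, 301))
erp = [2.0 if 74 <= t <= 251 else 1.0 for t in times]
print(relative_rm(erp, times, window=(74, 251)))
```

Averaging this quantity over all channels of a subject gives the per-subject value used for comparisons across conditions.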

To quantify change responses in subjects with impaired consciousness, GFP was employed due to its robustness to spatial variability and enhanced sensitivity to response latency. GFP calculates the differences in potential across all electrodes at each sampling point, thereby mitigating the influence of spatial variability in electrode placement or individual differences in brain anatomy. The calculation of GFP follows:

$${{\rm{GFP}}}\left(t\right)=\sqrt{\frac{\mathop{\sum }_{i=1}^{N}{\left[{r}_{i}\left(t\right)-\bar{r}\left(t\right)\right]}^{2}}{N}}$$

where \(N\) is the total number of channels, \({r}_{i}\left(t\right)\) represents the ERP of the \({i}^{{th}}\) channel at time \(t\), and \(\bar{r}\left(t\right)\) denotes the averaged response across all channels at time \(t\). The difference between the maximum GFP value detected within 300 ms after the change point of the train stimulation and the mean GFP value across the 200-ms baseline response before change was used as an indicator of the change response.
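As a rough illustration of the GFP formula and the change-response indicator, the following Python sketch computes both for a toy two-channel recording. It is not the authors' MATLAB implementation; the function names, the toy data, and the exact handling of window boundaries (baseline includes 0 ms, search window excludes it) are assumptions.

```python
import math

def gfp(channel_values):
    """GFP at one time point: standard deviation across channels (divide by N)."""
    n = len(channel_values)
    mean = sum(channel_values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in channel_values) / n)

def change_response_index(erp, times_ms, search=(0, 300), baseline=(-200, 0)):
    """Maximum GFP after the change point minus mean baseline GFP.

    erp[i][k] is the ERP of channel i at time sample k.
    """
    gfp_t = [gfp([ch[k] for ch in erp]) for k in range(len(times_ms))]
    peak = max(g for g, t in zip(gfp_t, times_ms) if search[0] < t <= search[1])
    base = [g for g, t in zip(gfp_t, times_ms) if baseline[0] <= t <= baseline[1]]
    return peak - sum(base) / len(base)

# Toy data: two channels with opposite polarity (hypothetical values)
times_ms = [-200, -100, 0, 100, 200, 300]
erp = [
    [1.0, 1.0, 1.0, 4.0, 2.0, 1.0],
    [-1.0, -1.0, -1.0, -4.0, -2.0, -1.0],
]

index = change_response_index(erp, times_ms)  # peak GFP 4.0 minus baseline GFP 1.0
```

Because GFP subtracts the across-channel mean at each time point, it depends only on how potentials differ between electrodes, not on any single electrode's position.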

Psychological threshold

The ratio of change detection for each group was calculated by dividing the number of trials in which the subject pressed the left arrow key (indicating change detection) by the total number of trials in that group. Subjects with a ratio of change detection exceeding 0.3 in the control group or less than 0.6 in the Reg4-4.06 group were excluded from behavior-related analyses, including both No-gap and Gap conditions. In total, 36 subjects (out of 42 who participated in Experiments 3 and 4) were included in the stand-alone analyses of the No-gap change detection task. In the comparison of the Gap and No-gap change detection tasks, only the intersection of subjects (n = 34) was included. Psychometric functions were fitted to data using a cumulative Gaussian function55,56:

$$p\left(r\right)=\frac{1}{\sigma \sqrt{2{{\rm{\pi}}}}}{\int}^{r}_{-\infty}{e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{{{\rm{\sigma}}}}^{2}}}{dx}$$

where \(p\left(r\right)\) represents the ratio of change detection as a function of ICI \(r\), \(\mu\) is the Gaussian mean, and \(\sigma\) is the standard deviation (SD). The psychological threshold of change detection was defined as the ICI at which the fitted Gaussian function reached 0.6 (Fig. 5c). Curve fitting was performed using the ‘psignifit’ software package (see http://bootstrap-software.org/psignifit/) for MATLAB. Similarly, the psychological threshold of gap detection (0.4 of the Gaussian fit, Supplementary Fig. 1b) was obtained with the same fitting procedure.
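Once \(\mu\) and \(\sigma\) have been fitted, reading a threshold off the cumulative Gaussian amounts to inverting the function at the chosen criterion. The Python sketch below shows only that inversion step, using the standard library; it does not reproduce the psignifit fitting procedure and ignores any additional parameters a fitting package may estimate. The parameter values are hypothetical.

```python
from statistics import NormalDist

def detection_ratio(ici_ms, mu, sigma):
    """Cumulative Gaussian p(r): predicted ratio of change detection at ICI r."""
    return NormalDist(mu, sigma).cdf(ici_ms)

def threshold(mu, sigma, criterion=0.6):
    """ICI at which the cumulative Gaussian reaches the criterion level."""
    return NormalDist(mu, sigma).inv_cdf(criterion)

# Hypothetical fitted parameters (in ms), for illustration only
mu, sigma = 4.02, 0.01

thr_change = threshold(mu, sigma, criterion=0.6)
```

With a criterion of 0.6 the threshold lies slightly above \(\mu\), since \(p(\mu) = 0.5\) for a cumulative Gaussian; a criterion of 0.4 would place it slightly below \(\mu\).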

Assessment of consciousness

The level of consciousness in coma participants was assessed using the Coma Recovery Scale–Revised (CRS-R)57, a standardized behavioral assessment tool specifically designed to differentiate among disorders of consciousness (DoC), including coma, vegetative state/unresponsive wakefulness syndrome (VS/UWS), and minimally conscious state (MCS). The CRS-R comprises six subscales evaluating auditory, visual, motor, oromotor/verbal, communication, and arousal functions. Each subscale includes hierarchically arranged items, with scores reflecting the presence or absence of specific behavioral responses. The total CRS-R score ranges from 0 to 23, with higher scores indicating greater levels of behavioral responsiveness and awareness. Assessments were conducted by trained clinicians under standardized conditions to ensure reliability. The CRS-R score for each coma participant was obtained just before the EEG recording. In total, 20 out of 22 coma participants completed the CRS-R assessment.

Source reconstruction

EEG source reconstruction was performed to compare the neural generators of change responses between Reg4-4 and Reg4-4.06 within the [0, 300] ms window following the change point (Session 1, Experiment 2). Covariance matrices were computed from ERP data for each condition, using data from all 42 subjects. A standard boundary element method (BEM) head model (standard_bem.mat) and a standard MRI volume (single_subj_T1_1mm.nii), both provided in the FieldTrip template directory, were used to model the head and brain anatomy. The MRI and head model were aligned to the MNI coordinate system and resliced with a spatial resolution of 1 mm. Electrode positions were based on the standard 10-20 system (NeuroScan 64-channel Quick-Cap), with a total of 60 electrodes included after excluding M1, M2, CB1, and CB2. A regular three-dimensional source grid with 10 mm resolution was constructed in MNI space. Regions of interest (ROIs) were selected based on the AAL atlas (ROI_MNI_V4.nii), excluding brainstem and most cerebellar areas. The leadfield matrix was computed using the electrode configuration and the BEM head model, with normalization applied. Source reconstruction was performed using exact low-resolution electromagnetic tomography (eLORETA), based on the ERP-derived covariance and the precomputed leadfield. This procedure was repeated for each subject and condition.

To statistically locate the source of change responses, voxel-wise paired two-tailed t-tests were performed across subjects, with significance assessed using a cluster-based permutation test corrected at p < 0.05. The relative change in source power between conditions was quantified using the normalized difference:

$$\Delta {S}_{{norm}}=\frac{{S}_{2}-{S}_{1}}{{S}_{1}+{S}_{2}}$$

where \({S}_{1}\) and \({S}_{2}\) denote the average source power for Reg4-4 and Reg4-4.06, respectively.
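A short numerical illustration of the normalized difference, written as a Python sketch with hypothetical source-power values (the function name is an assumption):

```python
def normalized_difference(s1, s2):
    """Normalized difference (S2 - S1) / (S1 + S2) between two source powers."""
    return (s2 - s1) / (s1 + s2)

# Hypothetical source powers for Reg4-4 (s1) and Reg4-4.06 (s2)
delta = normalized_difference(1.0, 3.0)
```

For non-negative powers the measure is bounded between -1 and 1, is zero when the two conditions have equal power, and is positive when power is larger for Reg4-4.06.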

Statistics and reproducibility

All statistical analyses were performed using custom scripts in MATLAB R2021b (MathWorks). Data are presented as mean \(\pm \,\) standard error of the mean (SEM), unless otherwise stated. All statistical tests used in this study were two-tailed, and the threshold for statistical significance was set at p < 0.05.

For all analyses, “n” refers to the number of subjects, as specified in figure legends. Replicates were defined as independent biological replicates, corresponding to different subjects each performing the same experimental procedure.

For the identification of channels showing significant change responses (e.g., topographic plot of RM in Supplementary Fig. 2a), we computed the relative RM of change responses for each channel and subject and performed a two-tailed one-sample t-test against zero. P-values were corrected for multiple comparisons across channels using the Benjamini-Yekutieli false discovery rate (FDR) procedure58. Significant channels were marked with black dots in topographic plots (e.g., Supplementary Fig. 2a and Fig. 3c, f, and i).
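The Benjamini-Yekutieli step-up procedure applied to the per-channel p-values can be sketched as follows. This is a generic Python illustration of the standard procedure, not the authors' MATLAB script; the function name and the example p-values are hypothetical.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k such that
    p_(k) <= k * alpha / (m * c(m)), with c(m) = sum_{j=1..m} 1/j,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    c_m = sum(1.0 / j for j in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            largest_k = rank
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject

# Hypothetical p-values for four channels
significant = benjamini_yekutieli([0.3, 0.001, 0.04, 0.02])
```

The harmonic-sum factor c(m) makes this procedure more conservative than Benjamini-Hochberg, and it controls the FDR under arbitrary dependence between tests, which is relevant for spatially correlated EEG channels.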

To examine the difference in cP1 responses under Reg4-4.01 between correct and incorrect trials in the No-gap behavioral task (n = 10 for change detection ratio between 0.3 and 0.7), we used two-tailed Wilcoxon signed-rank tests on the average cP1 amplitude (calculated over [138, 168] ms relative to the change point and centered around the estimated cP1 peak; see Supplementary Fig. 4) for each channel.

For other pairwise comparisons (RM and behavioral comparisons), statistical significance was tested using two-tailed paired t-tests. Exact t-values, degrees of freedom, and effect sizes (Cohen’s d) are reported in the Results section. Cohen’s d was computed as:

$$d=\frac{\bar{X}-\bar{Y}}{\sqrt{\frac{1}{N-1}\mathop{\sum }_{i=1}^{N}{\left({X}_{i}-{Y}_{i}-\left(\bar{X}-\bar{Y}\right)\right)}^{2}}}$$

where \(X\) and \(Y\) are paired samples comprising \(N\) pairs (with \(Y\) = 0 for the one-sample t-test), \({X}_{i}\) and \({Y}_{i}\) are the \(i\)th elements of the paired samples, and \(\bar{X}\) and \(\bar{Y}\) are the sample means.
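The formula corresponds to the mean of the paired differences divided by their sample standard deviation. A minimal Python sketch, with a hypothetical function name and toy data:

```python
import math

def cohens_d_paired(x, y=None):
    """Cohen's d for paired samples: mean difference / SD of the differences.

    If y is omitted, it is taken as zero (the one-sample case).
    """
    if y is None:
        y = [0.0] * len(x)
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return mean_diff / sd_diff

# Toy paired data (hypothetical): differences are 1, 2, 3
d = cohens_d_paired([3.0, 5.0, 7.0], [2.0, 3.0, 4.0])
```

Here the differences have mean 2 and standard deviation 1, giving d = 2. Under this definition the paired t-statistic equals \(d\sqrt{N}\).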

To test for tuning effects of RM across conditions (Reg4-4.01, Reg4-4.02, Reg4-4.03, and Reg4-4.06) in the No-gap and Gap tasks, one-way ANOVAs were performed. Effect size for the one-way ANOVA was measured by eta-squared (\({\eta }^{2}\)), defined as:

$${\eta }^{2}=\frac{S{S}_{{between}}}{S{S}_{{total}}}$$

where \(S{S}_{{between}}\) is the sum of squares attributable to the factor (between groups) and \(S{S}_{{total}}\) is the total sum of squares.
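Eta-squared can be computed directly from the group data, as in the following Python sketch (the function name and toy values are assumptions for illustration):

```python
def eta_squared(groups):
    """Eta-squared for a one-way design: SS_between / SS_total."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total

# Toy RM values for three conditions (hypothetical)
eta2 = eta_squared([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```

In this toy example the between-group sum of squares is 6 and the total is 12, so half of the variance is attributable to the condition factor.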

To test the relationship between EEG responses and the level of consciousness, we computed Pearson’s correlation coefficients between CRS-R scores from 20 coma subjects and their GFP responses following sound onset and sound change. No significant correlations were observed for either onset or change responses.

In addition, Bayesian statistical analyses were conducted to quantify evidence for the null hypothesis across different types of comparisons. For pairwise comparisons, we used Bayesian paired-sample or independent-sample t-tests with a Jeffreys-Zellner-Siow (JZS) prior, as described by Rouder et al. (2009)59. For group comparisons involving more than two levels, we employed Bayesian one-way ANOVA, using default Cauchy priors on effect sizes (Rouder et al.)60. To assess associations between continuous variables, we conducted Bayesian correlation analyses based on a Jeffreys prior for Pearson’s correlation coefficient (Ly et al.)61. All Bayes factors (BF10) were computed using the BayesFactor toolbox62 for MATLAB. A BF10 > 1 indicates evidence in favor of the alternative hypothesis, whereas a BF10 < 1 indicates evidence in favor of the null.