Abstract
Personalized auditory gamma stimulation may sharpen human cognition; however, behavioral evidence remains sparse. Here, we recruited 29 healthy adults (15 men; age range, 20–49 years) with no self-reported history of psychiatric disorders, identified each participant’s individual gamma frequency (IGF), and investigated whether auditory stimulation at that frequency, delivered through individualized IGF music, modulated cognitive performance. We first estimated each participant’s IGF from an electroencephalogram (EEG) recorded during a 5-min chirp-music sweep. The identified frequency was embedded into a musical track (IGF music); a spectrally matched track lacking gamma enhancement served as control music. Participants completed five immediate recall trials and a long-delay free recall of a verbal learning task, a visual change-detection task, a card matching game, and a bivalent shape task while listening to either music track (within-subject). Linear mixed-effects models revealed that IGF music increased word recall in the fifth immediate recall trial and lowered the inverse efficiency score in incongruent trials of the bivalent shape task, whereas other outcomes were unchanged. Therefore, auditory gamma stimuli may improve short-term memory and executive control. These findings highlight the potential of individualized sensory stimulation as a promising, non-invasive approach to ameliorate cognitive impairments associated with various neuropsychiatric and neurodegenerative disorders.
Introduction
Improving cognitive deficits associated with neuropsychiatric and neurodegenerative disorders in humans remains a challenge. Among the candidate electroencephalogram (EEG) biomarkers to aid in the diagnosis, prognostication, and monitoring of these conditions, oscillatory activity in the gamma band (≥ 30 Hz) has received particular attention. A growing body of evidence has linked gamma-band activity to higher-order brain functions, including cognition, consciousness, sensory integration, encoding, maintenance, and retrieval of short-term and episodic memories1,2,3,4,5,6,7,8,9,10,11. Aberrant gamma-band activity has also been documented in several neuropsychiatric disorders, such as Alzheimer’s disease (AD) and schizophrenia12,13. In AD, resting-state gamma synchrony is reduced14,15, and the latencies of gamma-band activities are prolonged relative to those in healthy controls16, whereas task-related gamma-band activity is attenuated in schizophrenia during attention, memory, and object-representation tasks17.
Beyond serving as a biomarker, gamma-band activity is a potential therapeutic target. In transgenic AD mouse models, entrainment of 40-Hz oscillations through visual flicker or combined audiovisual stimulation decreased amyloid-β deposition and improved cognitive performance18,19. Multisensory gamma stimulation improves brain-wide connectivity, particularly in the hippocampus and prefrontal cortex, and exerts protective effects on neurons and glia20.
Although these findings originate largely from animal studies, several human studies have suggested comparable benefits. For instance, gamma-band repetitive transcranial magnetic stimulation (rTMS) improves cognitive scores for up to 8 weeks in patients with prodromal AD21. Transcranial alternating current stimulation (tACS) in the gamma frequency range enhances fluid intelligence in healthy individuals, with the largest gains observed in individuals with low baseline performance22, and boosts multiple indices of episodic memory in patients with mild cognitive impairment (MCI)23. Combined visual and auditory gamma stimulation slows cortical atrophy, preserves functional connectivity, and improves associative memory task performance in early stage AD24.
To date, almost all gamma entrainment studies have employed a 40-Hz stimulus. Galambos et al. demonstrated that the human auditory steady-state response (ASSR) is maximal at approximately 40 Hz25, establishing a de facto standard on which subsequent ASSR and gamma entrainment work has largely been based. Most ASSR studies have adopted 40-Hz stimulation and consistently reported a robust and reliable response at this frequency. Likewise, the rodent studies that achieved reductions in amyloid β protein deposition following visual or auditory stimulation employed a 40-Hz carrier, further intensifying interest in 40-Hz gamma stimulation18,19. Recent research has uncovered a novel mechanism of action, revealing that gamma stimulation enhances the function of the glymphatic system and facilitates the physical clearance of amyloid β26, thus raising further expectations for its therapeutic potential. However, the peak gamma frequency varies across individuals. This individual gamma frequency (IGF)27, the specific frequency that elicits the maximal neural response in a given person, generally falls between 30 Hz and 50 Hz.
The IGF declines with age28,29 and shifts to lower frequencies in patients with schizophrenia30,31. The effectiveness of IGF-targeted gamma stimulation has also been suggested in tACS studies. tACS delivered at a frequency near each participant’s IGF improved performance on an auditory gap detection task32, implying that the IGF reflects an individual’s optimal perceptual processing frequency and can be externally leveraged. However, no study has examined whether auditory IGF stimulation modulates cognitive performance.
We previously developed “Gamma Music,” a natural-sounding musical piece that embeds a 40-Hz auditory carrier while preserving musicality33. Extending this concept to individualized stimulation could provide a low-burden, user-friendly means of enhancing human cognition, with potential applications not only in clinical settings, but also in everyday wellness, business productivity, and education. If auditory gamma stimulation at a specific IGF can transiently or persistently boost cognitive function, it might serve as a learning-efficiency aid for students as well as individuals with learning disabilities, and ultimately contribute to interventions for conditions such as AD.
In the present study, we identified the IGF of healthy adults and then investigated whether auditory stimulation at that frequency, delivered through individualized IGF music, modulated their cognitive performance. To efficiently determine the IGF, we employed a chirp stimulus, a brief signal whose frequency sweeps continuously across the gamma range, enabling rapid estimation of the frequency that elicits the maximal EEG response31,34,35. We embedded this chirp into a musical context (“Chirp Music”) to localize the IGF, and subsequently generated IGF music by accentuating the identified frequency component. This approach enabled us to present a participant-specific gamma stimulus and explore its cognitive impact in an ecologically valid setting. In addition to the IGF condition, we implemented a control condition in which participants listened to control music, an otherwise identical track lacking gamma-band modulation. Each participant completed a battery of cognitive tasks in both conditions. By comparing task performance between the IGF and control music, we aimed to delineate which cognitive functions were susceptible to modulation by individualized gamma-band auditory stimulation.
Results
Timeline for experimental protocol
The experimental timeline is shown in Fig. 1. First, each participant’s IGF was identified based on a 5-minute EEG recording obtained while listening to chirp music. In the IGF condition, participants listened for 5 min to IGF music tailored to their own IGF, whereas in the control condition, they listened to control music with no gamma-band enhancement.
The cognitive tasks were administered as follows. Participants completed five rounds of the verbal learning task (VLT), each consisting of an encoding phase followed by immediate recall (IR). They then performed the visual change-detection task (VCDT), a card-based matching game, the bivalent shape task (BST), and the long-delay free recall (LDFR) phase of the VLT. Each task was carried out once under the IGF condition and once under the control condition, with the order of the conditions counterbalanced across participants. Participants were randomly assigned to one of two groups: one group (n = 15) completed the IGF condition first, followed by the control condition, while the other group (n = 14) completed the conditions in the reverse order.
Timeline for the experimental protocol. We first identified each participant’s individual gamma frequency (IGF) during a 5‑min EEG recording while they listened to chirp music. They then heard either IGF music or control music for 5 min, completed the encoding phase of the verbal learning task (VLT), and performed the full cognitive battery with the same music playing in the background. After finishing the long‑delay free recall (LDFR), participants took a 10‑min break and subsequently repeated the entire procedure under the alternate music condition. Participants were counterbalanced across the two conditions: IGF First and Control First.
EEG analysis
The participants listened to chirp music in which a chirp signal sweeping from 30 to 60 Hz was embedded to determine the IGF that elicited the greatest neural response. The EEG data were subjected to time–frequency analysis, and the frequency showing the maximal gamma-band response over time was designated as the IGF. The distribution of IGFs across participants is provided in Supplementary Figure S1, whereas full time–frequency representations for all participants are shown in Supplementary Data S1. The mean IGF was 45.6 ± 5.3 Hz (mean ± standard deviation [SD], n = 29).
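As a rough illustration of how a peak gamma frequency can be read out from repeated epochs, the sketch below computes a phase-locking factor for each candidate frequency between 30 and 60 Hz and returns the maximum. It is a simplification under stated assumptions, not the analysis pipeline used here: the epochs are synthetic (a stationary phase-locked sinusoid plus Gaussian noise rather than a chirp response), a single-frequency Fourier coefficient stands in for a full time–frequency decomposition, and all function names, the sampling rate, and the trial count are hypothetical.

```python
import cmath
import math
import random


def phase_locking_factor(trials, freq_hz, fs_hz):
    """Length of the mean unit phasor across trials at one frequency (0 to 1)."""
    unit_phasors = []
    for trial in trials:
        # Single-frequency discrete Fourier coefficient of this trial.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
            for n, x in enumerate(trial)
        )
        unit_phasors.append(coeff / abs(coeff))
    return abs(sum(unit_phasors) / len(unit_phasors))


def estimate_igf(trials, fs_hz, candidates=range(30, 61)):
    """Return the candidate frequency (Hz) with the largest phase-locking factor."""
    return max(candidates, key=lambda f: phase_locking_factor(trials, f, fs_hz))


def simulate_trials(true_hz, n_trials=20, fs_hz=500, duration_s=1.0, seed=0):
    """Synthetic stand-in for EEG epochs (assumption): sinusoid plus unit-variance noise."""
    rng = random.Random(seed)
    n_samples = int(fs_hz * duration_s)
    return [
        [
            math.sin(2 * math.pi * true_hz * n / fs_hz) + rng.gauss(0.0, 1.0)
            for n in range(n_samples)
        ]
        for _ in range(n_trials)
    ]


trials = simulate_trials(true_hz=45)
print(estimate_igf(trials, fs_hz=500))
```

With real chirp responses the stimulation frequency changes over time, so the phase-locking factor would be evaluated per time–frequency bin rather than once per epoch.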
Behavioral score
Table 1 summarizes the behavioral results for each task. For the VLT and the matching game, the table reports the number of correct responses. For the VCDT and BST, performance was expressed as the inverse efficiency score (IES), a composite metric combining reaction time and accuracy. In the BST, the IES was calculated separately for congruent trials, where the shape and color of the target matched the response cue shown at the bottom of the screen, and for incongruent trials, where neither shape nor color matched. All values are presented as mean ± SD across participants.
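The exact IES formula is not stated in this section. A minimal sketch, assuming the commonly used definition (mean reaction time on correct trials divided by the proportion of correct trials), is given below; the function name and the example trial data are hypothetical.

```python
def inverse_efficiency_score(trials):
    """IES = mean reaction time of correct trials / proportion of correct trials.

    `trials` is a list of (reaction_time_ms, is_correct) pairs.
    Lower values indicate faster and/or more accurate performance.
    """
    correct_rts = [rt for rt, is_correct in trials if is_correct]
    if not correct_rts:
        raise ValueError("IES is undefined when no trial is correct")
    proportion_correct = len(correct_rts) / len(trials)
    mean_correct_rt = sum(correct_rts) / len(correct_rts)
    return mean_correct_rt / proportion_correct


# Invented example: four correct trials (mean 600 ms) and one error (80% accuracy).
example_trials = [
    (500.0, True),
    (700.0, True),
    (600.0, True),
    (600.0, True),
    (900.0, False),
]
print(inverse_efficiency_score(example_trials))
```

Because the score is expressed in reaction-time units inflated by the error rate, a decrease in IES can reflect faster responses, fewer errors, or both.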
Statistical analysis by model selection
To assess behavioral differences between the IGF and control conditions, we used a model selection approach based on linear mixed-effects (LME) modeling. Each model included up to two fixed-effect factors and one random-effect factor, defined as follows:
- Music: IGF music vs. control music.
- ExpOrder: First vs. second (to account for potential practice effects, because each task was completed twice).
- sub: Participants (random intercept).
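Using Wilkinson notation, the maximal model was as follows: Outcome ~ Music*ExpOrder + (1|sub), where Outcome denotes the behavioral score of the task being analyzed.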
Five nested models, from the saturated model to the null model, were fitted to each behavioral outcome, and the best-fitting model was selected based on the Akaike information criterion (AIC)36. Supplementary Table S1 lists the AIC values for all the candidate models. Table 2 presents the model for each task selected according to the AIC. Supplementary Tables S2–S7 provide the parameter estimates with 95% confidence intervals and the detailed analysis of variance results for all selected models other than the null model.
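The selection step can be sketched as follows. The five candidate formulas are our reading of "from the saturated model to the null model" given the two fixed-effect factors, and the log-likelihoods and parameter counts are invented purely for illustration; they are not values from Supplementary Table S1.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: formula -> (log-likelihood, number of estimated parameters).
# Parameter counts assume fixed-effect coefficients plus two variance components.
CANDIDATES = {
    "y ~ Music*ExpOrder + (1|sub)": (-96.0, 6),
    "y ~ Music + ExpOrder + (1|sub)": (-96.2, 5),
    "y ~ Music + (1|sub)": (-96.5, 4),
    "y ~ ExpOrder + (1|sub)": (-99.0, 4),
    "y ~ 1 + (1|sub)": (-99.5, 3),
}


def select_by_aic(candidates):
    """Return the formula with the lowest AIC and the full AIC table."""
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


best_model, aic_table = select_by_aic(CANDIDATES)
for name, value in sorted(aic_table.items(), key=lambda item: item[1]):
    print(f"{value:7.2f}  {name}")
```

In this invented example the Music-only model wins because the extra terms of the richer models do not raise the likelihood enough to offset the penalty of two points per added parameter.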
To evaluate memory performance once learning had stabilized, we analyzed the number of words recalled in the fifth (final) IR trial (IR5) of the VLT. As shown in Table 2, the interaction model (Music × ExpOrder) was selected for the IR5 of the VLT. In contrast, the VCDT was best explained by a model that included only ExpOrder as the fixed effect. Neither the main effect of music nor its interaction with ExpOrder improved the model fit for the VCDT. Post-hoc contrasts revealed that the second session yielded a significantly lower IES than the first session (first–second 95% CI [31.4, 113], Fig. 2A), indicating better performance in the second administration. For the incongruent trials of the BST, a model that included only the main effect of music was selected; neither the main effect of ExpOrder nor the interaction term improved the model fit. As shown in Fig. 2B, the IGF condition yielded a significantly lower IES than the control condition (Control–IGF 95% CI [3.3, 36.5]), indicating better performance with individualized gamma stimulation.
Behavioral results of tasks. (A) VCDT, (B) BST Incongruent, (C) IR5 of VLT (Interaction of Music and ExpOrder), (D) IR5 of VLT (Listening time). Data represent the mean and standard deviation.
For the IR5 of the VLT, the interaction model (Music × ExpOrder) was favored. Post‑hoc contrasts revealed that the Control First session produced fewer recalls than any other session (Control First–IGF First 95% CI [−2.82, −0.15]; Control First–Control Second 95% CI [−2.89, −0.22]; Control First–IGF Second 95% CI [−2.24, −0.05], Fig. 2C). This pattern may be attributable to the fact that Control First was the only session in which participants had not yet been exposed to IGF music before encoding.
Following the primary model selection, we conducted two additional exploratory analyses to better understand the nature of the observed effects. First, we constructed an additional model for IR5 of the VLT to explicitly account for exposure to IGF music. Although the previous analysis included ExpOrder to correct for practice effects, it did not consider the duration of IGF‑Music exposure accumulated before each IR trial. For example, the Control First condition involved no prior exposure to IGF music, whereas the Control Second condition followed the completion of the IGF session, amounting to approximately 5 min of passive listening plus approximately 10 min during task execution. Likewise, IGF First and IGF Second provided 5 min of IGF‑Music exposure before the IR block.
To test whether this discrepancy explained the poorer performance in the Control First group (Fig. 2), we replaced Music and ExpOrder with a categorical predictor, Listening Time, with three levels (None, Briefly [~ 5 min], and Full [~ 15 min in total]) and fitted the following model: IR5 ~ ListeningTime + (1|sub).
Supplementary Tables S8 and S9 provide the parameter estimates with 95% confidence intervals and the detailed analysis of variance results for the model, and Fig. 2D shows the contrasts. Both Briefly and Full yielded significantly higher recall numbers than None (Briefly–None 95% CI [0.35, 2.16]; Full–None 95% CI [0.27, 2.57]). The AIC of the model was 200.93, indicating a better fit than the previous model containing the interaction of music and order (AIC = 202.43).
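The recoding of the four Music × ExpOrder cells into the three Listening Time levels, as described in the text, can be written out explicitly. The function name and the string labels below are hypothetical; the mapping itself follows the exposure durations stated above.

```python
def listening_time(music, exp_order):
    """Map a (music condition, session order) cell to prior IGF-music exposure.

    "None":    Control First, no IGF music heard before the IR block
    "Briefly": IGF First or IGF Second, about 5 min of IGF music before the IR block
    "Full":    Control Second, the whole IGF session (about 15 min) already completed
    """
    if music not in ("IGF", "Control") or exp_order not in ("First", "Second"):
        raise ValueError(f"unknown cell: {music!r}, {exp_order!r}")
    if music == "IGF":
        return "Briefly"
    return "None" if exp_order == "First" else "Full"


for cell in [("Control", "First"), ("IGF", "First"), ("IGF", "Second"), ("Control", "Second")]:
    print(cell, "->", listening_time(*cell))
```

Written this way, it is apparent that the "Full" level contains only second-session data, whereas "Briefly" pools first- and second-session data.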
Second, to further investigate individual differences, we constructed an additional model that included age as a covariate in the models for the incongruent BST and the IR5 of the VLT. In our primary model selection, the best model for IR5 included the Music × ExpOrder interaction, while the best model for the incongruent BST included the main effect of Music. To investigate the potential influence of age on the effect of Music, we conducted a secondary analysis where we added a Music × Age interaction term to these models.
The resulting models were IR5 ~ Music*ExpOrder + Music*Age + (1|sub) and incongruent BST ~ Music*Age + (1|sub). The AIC values for these new models were 202.34 and 669.47, respectively. In both cases, the AIC of the more complex model including the age interaction did not improve by more than 2 points compared to the AIC from the primary model selection. Therefore, we could not conclude that changes with aging were responsible for the magnitude of the Music effect.
Discussion
To the best of our knowledge, this is the first study to show that auditory IGF stimulation alone can modulate human cognition. For IR5 of the VLT, a model that included the interaction between Music and ExpOrder was selected. For the incongruent BST condition, the optimal model contained only the main effect of music. In both cases, the IGF condition outperformed the control. In contrast, null models were preferred for the LDFR of the VLT, the Matching Game, and congruent condition of the BST, whereas the VCDT was best explained by a model with ExpOrder alone, and music had no explanatory power in that task. Future investigations should clarify the neural mechanisms and identify the cognitive domains that derive the greatest benefits from this approach.
In the BST, incongruent trials, in which the color or shape of the target mismatched the response cue, imposed greater cognitive demands on attentional allocation and inhibitory control, which typically manifest as poorer performance on composite measures such as the IES. Therefore, the lower IES observed under the IGF condition in incongruent trials suggests that individualized gamma stimulation facilitates these executive processes. In congruent trials, the IGF and control conditions produced comparable IES values, presumably due to a ceiling effect: the task was already easy enough that performance could not benefit further from gamma stimulation. These findings indicate that IGF-aligned auditory gamma stimulation preferentially enhances performance under high cognitive load, specifically when efficient attentional allocation and inhibition are required.
For IR5 of the VLT, we tested an additional hypothesis by replacing the Music × ExpOrder interaction with a fixed-effect predictor that captures exposure to IGF music (Listening Time). The resulting model yielded a lower AIC (200.93) than that of the interaction model (202.43). Although the ΔAIC was only 1.5—below the conventional threshold of 2 for decisive evidence37—its parsimony and modest improvement suggested that recall performance might be better explained by the amount of IGF‑Music exposure. This finding implies that the memory-improving effects of IGF‑aligned stimulation do not vanish immediately but may persist for at least a short time. However, the results of the VLT also have aspects that require a cautious interpretation. The effect of IGF music on immediate recall emerged only in the IR5, a point where performance was approaching ceiling levels. Therefore, rather than reflecting an enhancement of overall learning capacity, this finding might indicate a more subtle effect, such as the stabilization of memory traces or more efficient retrieval under high memory load once learning has plateaued.
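For reference, the size of this AIC difference can be made concrete using the standard relative-likelihood reading of AIC differences. This is a supplementary calculation based only on the two AIC values reported above, not an analysis from the original model selection:

```latex
\Delta\mathrm{AIC} = \mathrm{AIC}_{\text{interaction}} - \mathrm{AIC}_{\text{Listening Time}}
                   = 202.43 - 200.93 = 1.50,
\qquad
\exp\!\left(-\tfrac{\Delta\mathrm{AIC}}{2}\right) = \exp(-0.75) \approx 0.47 .
```

Under this reading, the interaction model is roughly 0.47 times as probable as the Listening Time model to minimize information loss, which is consistent with treating the two models as close competitors rather than regarding one as decisively better.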
To explore potential individual differences that might modulate the efficacy of the intervention, our secondary analysis investigated whether age influenced the specific effects observed in the incongruent BST and VLT IR5. Our results indicated that age was not a significant moderating factor for these outcomes; we found no significant interaction between the Music condition and age in the models for these two tasks. This suggests that, for the enhancements to executive control and verbal memory recall that we observed, the magnitude of the benefit was consistent across our sample’s age range (20–49 years). However, this analysis was limited to the tasks where a primary effect of music was found. The hypothesis that older adults, who may have lower baseline gamma function, might derive greater benefit remains plausible and warrants dedicated investigation in future studies specifically recruiting an older cohort.
Our observations align with the findings of previous work showing gamma-induced enhancement in short-term memory. For instance, Rufener et al.23 delivered 40 Hz tACS over five weeks to patients with mild cognitive impairment and reported increased recall scores on the California verbal learning test (CVLT), whereas long-term recall remained unchanged. Consistent with this study, we selected a null model for the LDFR phase, which indicated a selective effect on short-term memory. Rufener et al. further demonstrated altered functional connectivity between the inferior parietal lobule and hippocampus in responders, implicating hippocampal mechanisms. Although we did not measure connectivity, our behavioral results are compatible with the hypothesis that IGF‑Music stimulation engages hippocampal‑centered networks via sensory pathways to bolster short‑term memory. Future work combining IGF‑Music with neuroimaging or electrophysiology could directly test this mechanism. Gamma oscillations arise from a finely tuned balance between cortical excitation and inhibition, often termed E/I balance38. The IGF is believed to represent the point of optimal balance; precisely driving the brain at this frequency can further stabilize the E/I balance. Such stabilization may underlie the performance gains observed in the incongruent trials of the BST and the fifth IR trial of the VLT.
The absence of effects in certain tasks, such as the VCDT and the matching game, is informative as it helps to constrain the interpretation of our findings and suggests a degree of specificity in the effects of IGF stimulation. One potential explanation for this pattern is the modality of the stimulation. As an auditory stimulus, IGF music may preferentially modulate activity within auditory-verbal processing networks and domain-general executive control circuits, while having a less direct impact on networks primarily supporting visuospatial working memory, which are central to the VCDT and matching game.
Furthermore, the nature of the cognitive load may be a critical factor. The tasks that benefited from IGF stimulation (the incongruent BST and the fifth trial of the VLT) demand not just memory maintenance but also high-level executive processes such as inhibitory control and memory consolidation under significant cognitive load. It is plausible that auditory gamma stimulation is most effective at enhancing these specific functions, which may have been less taxed in the visuospatial working memory tasks. Thus, these null findings suggest that the cognitive enhancements from auditory IGF stimulation are not global but are specific to particular cognitive domains and task demands, a key direction for future research.
Despite its novel findings, this study has several limitations that suggest important avenues for future research. First, our study was conducted on a relatively small sample of healthy young adults. Therefore, the generalizability of our findings to other populations, particularly older adults or patients with neuropsychiatric disorders such as MCI or Alzheimer’s disease, remains to be determined. Future research should investigate the efficacy of individualized IGF stimulation in these clinical populations, for whom such a non-invasive intervention could have significant therapeutic potential.
Second, while our findings suggest that the cognitive enhancement from IGF stimulation can persist for 20–30 min after the experimental condition, its durability over longer periods, such as several hours or across multiple days, remains unknown. Future longitudinal studies are therefore needed to assess the long-term sustainability of these benefits and to determine optimal stimulation protocols for producing lasting improvements.
Third, our experimental design cannot definitively isolate the effects of personalization from the more general effects of gamma-band stimulation. We compared stimulation at each participant’s IGF against a control condition that lacked any salient gamma-band enhancement. While this design demonstrates that IGF stimulation is more effective than no stimulation, it does not rule out the possibility that a similar cognitive benefit could be achieved with any non-personalized gamma frequency. To establish that the observed effects are truly specific to each participant’s dominant gamma response, future studies should incorporate additional control conditions. For instance, comparing the effects of IGF stimulation against those of a standardized frequency (e.g., 40 Hz) would provide a more rigorous test of the personalization hypothesis.
Fourth, our methodological choices for EEG acquisition warrant discussion. We employed a single-channel EEG setup (FCz) to identify each participant’s IGF. This approach was chosen for two primary reasons: (1) theoretical focus, as the ASSR is robustly and reliably recorded from fronto-central sites, making FCz a sufficient location for our specific goal of identifying the peak response frequency; and (2) practical applicability, as a simpler, portable setup holds greater promise for future real-world applications. However, we acknowledge that this single-electrode approach limits the spatial resolution and may be more susceptible to noise compared to a more comprehensive multi-channel EEG setup. While our artifact control procedures were designed to mitigate this, future research could benefit from using high-density EEG to enhance the precision and reliability of IGF estimation.
Fifth, we relied on self-reports to confirm normal hearing abilities rather than on objective audiometric testing. This approach cannot exclude the possibility that individuals with undiagnosed, subclinical hearing loss, particularly in the frequency ranges relevant to our gamma-band stimuli, were included in the sample. Such deficits could act as a potential confound by influencing how the auditory stimulation was perceived and processed, thereby affecting the cognitive outcomes. Therefore, future investigations should incorporate objective audiological screening to ensure participant suitability and increase the reliability of the findings. Additionally, the subjective loudness of the IGF and control music was not controlled for. Consequently, we cannot exclude the possibility that differences in perceived intensity between the conditions influenced the cognitive outcomes.
Sixth, our exploratory reanalysis introducing the “Listening Time” factor must be interpreted with caution. A significant confound exists, particularly concerning the “Full” exposure condition, where increased exposure to IGF music was perfectly confounded with general task practice (i.e., having completed one full experimental block). Consequently, we cannot unequivocally attribute the enhanced performance in this condition to the persistence of the music’s effects alone. However, the comparison between the “None” condition (Control First) and the “Briefly” condition (IGF First) is not subject to this confound, as both occurred during the participants’ initial session and were equally free from general practice effects. The significant improvement observed in the “Briefly” condition relative to the “None” condition therefore provides strong, unconfounded evidence that a short 5-minute exposure to IGF music is sufficient to enhance immediate cognitive performance. While this supports our main hypothesis, future studies designed to fully disentangle exposure duration from practice effects, for instance by including a task familiarization session, are needed to rigorously test the persistence of these benefits over time.
Seventh, it is also important to consider the statistical power of our study given the sample size of 29 participants. Our post-hoc power analysis provided two key insights. First, for the significant effects observed in the incongruent BST and the IR5 of the VLT, the statistical power was 66.8% and 63.6%, respectively. While these values are below the conventional 80% threshold, the detection of significant effects under these conditions suggests that the observed cognitive enhancements are likely robust. Second, for the tasks where no significant effect of Music was found (congruent BST, VCDT, matching game, and LDFR of VLT), the post-hoc power was extremely low (all below 7%). Future studies with larger sample sizes are needed to test for effects in these other cognitive domains definitively.
Determining IGF requires EEG recordings, and EEG is one of the most accessible and cost-effective noninvasive neuroimaging modalities. Recent advances in neurotechnology have miniaturized and lowered the cost of EEG hardware, making at-home and point-of-care measurements feasible. Consequently, IGF stimulation is not only technically feasible for large-scale deployment but also holds promise for applications ranging from consumer cognitive enhancement to clinical interventions for neuropsychiatric and neurodegenerative disorders.
Methods
Participants
This study was approved by the Shiba Palace Clinic Ethics Review Committee (approval number: 156831_rn-38608) and was conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent prior to the experiment.
Twenty-nine individuals (15 men and 14 women; age range, 20–49 years) participated in this study. Participants were required to have normal or corrected-to-normal vision and normal hearing, which they confirmed before the main tasks. Exclusion criteria included any history of neurological or psychiatric disorders and current use of psychoactive medication.
Chirp music
To efficiently determine each participant’s IGF, we employed a chirp stimulus, whose utility has been previously demonstrated31,34,35. To reduce the participant burden associated with repetitive chirp presentations, we created chirp music by embedding the chirp signal into an original musical score created by VIE, Inc. (Kamakura, Japan).
The gamma stimulus train consisted of 1.5-ms white-noise bursts presented with alternating polarity. The inter-click interval was varied systematically so that the click rate swept from 60 Hz down to 30 Hz and back up to 60 Hz, thereby covering the 30–60 Hz gamma range. Each sweep lasted 1500 ms, and the sweep was repeated 46 times with a stimulus-onset asynchrony of 6000 ms. The final musical piece lasted approximately 5 min 40 s (340 s). The audio file for the chirp music is provided in Supplementary Audio S1.
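The timing of such a click train can be sketched as follows. The shape of the sweep is not specified beyond its endpoints and duration, so the code assumes a symmetric linear change in click rate (60 Hz to 30 Hz over the first 750 ms, then back to 60 Hz); the function names are hypothetical, and only onset times and polarities are generated, not the noise bursts themselves.

```python
import math


def chirp_click_onsets(f_hi=60.0, f_lo=30.0, sweep_s=1.5):
    """Click onsets (s) for one sweep whose rate goes f_hi -> f_lo -> f_hi linearly.

    A click is placed each time the accumulated phase (integral of the rate)
    reaches an integer number of cycles.
    """
    if f_hi <= f_lo:
        raise ValueError("f_hi must exceed f_lo")
    half = sweep_s / 2.0
    slope = (f_hi - f_lo) / half                          # rate change in Hz per second
    phase_half = f_hi * half - 0.5 * slope * half ** 2    # cycles elapsed at the midpoint
    phase_total = 2.0 * phase_half                        # the sweep is symmetric
    onsets = []
    k = 0
    while k <= phase_total:
        if k <= phase_half:
            # Descending half: solve f_hi*t - 0.5*slope*t^2 = k for t.
            t = (f_hi - math.sqrt(f_hi ** 2 - 2.0 * slope * k)) / slope
        else:
            # Ascending half: solve phase_half + f_lo*u + 0.5*slope*u^2 = k for u.
            u = (-f_lo + math.sqrt(f_lo ** 2 + 2.0 * slope * (k - phase_half))) / slope
            t = half + u
        onsets.append(t)
        k += 1
    return onsets


def sweep_start_times(n_sweeps=46, soa_s=6.0):
    """Start time (s) of each sweep given the stimulus-onset asynchrony."""
    return [i * soa_s for i in range(n_sweeps)]


onsets = chirp_click_onsets()
polarities = [1 if i % 2 == 0 else -1 for i in range(len(onsets))]  # alternating polarity
intervals = [b - a for a, b in zip(onsets, onsets[1:])]
print(len(onsets), round(min(intervals) * 1000, 2), round(max(intervals) * 1000, 2))
```

Under the linear-sweep assumption, the inter-click interval lengthens from about 1/60 s at the start of the sweep to about 1/30 s at its midpoint and then shortens again.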
IGF and control music
We previously developed Gamma Music that embeds gamma-band stimulation within a musical composition33. Among its constituent elements, the Gamma keyboard most prominently enhances the 40 Hz component and receives the highest subjective preference ratings from participants33.
Building on this work, the IGF music used in the present study was tailored to each participant based on their unique neural response. To achieve this, we first identified each participant’s IGF by analyzing an EEG recording captured while they listened to chirp music. The IGF was defined as the frequency that elicited the maximal phase-locking factor and evoked power. We then selected the appropriate stimulus from a pre-generated library of 31 versions of our “Gamma Keyboard” sound element. Each version was engineered to selectively enhance a single frequency between 30 and 60 Hz in 1-Hz steps. For example, if a participant’s IGF was determined to be 45 Hz, the music track featuring the Gamma Keyboard with the 45-Hz enhancement was chosen as their “IGF music”.
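The lookup from an estimated IGF to one of the 31 library versions can be sketched in a few lines. How non-integer estimates were handled is not stated; the nearest-frequency rule and the function name below are assumptions made for illustration.

```python
LIBRARY_HZ = list(range(30, 61))  # 30, 31, ..., 60 Hz in 1-Hz steps: 31 versions


def select_library_frequency(igf_hz):
    """Return the library frequency closest to the estimated IGF (assumed rule).

    Exact ties (for example 45.5 Hz) resolve to the lower frequency, because
    min() keeps the first of equally good candidates.
    """
    if not LIBRARY_HZ[0] <= igf_hz <= LIBRARY_HZ[-1]:
        raise ValueError("IGF estimate lies outside the 30-60 Hz library range")
    return min(LIBRARY_HZ, key=lambda f: abs(f - igf_hz))


print(len(LIBRARY_HZ), select_library_frequency(45), select_library_frequency(45.6))
```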
In the control condition, we presented control music derived from a Control keyboard that contained no salient enhancement in the gamma range.
The gamma-frequency component was embedded via amplitude modulation, not by the addition of a low-frequency tone. This acoustic structure is visualized in Fig. 3. The standard Fast Fourier Transform (FFT) plot (Fig. 3A, middle) displays the musical frequencies of the keyboard notes themselves (i.e., their pitches and harmonics). The Envelope FFT plot (Fig. 3A, bottom), conversely, analyzes the rhythm of the loudness fluctuations. The distinct peak at the target frequency in this plot is direct evidence of the embedded amplitude modulation, a feature absent in the control stimulus (Fig. 3B). More detailed specifications of the 40 Hz IGF music/gamma Keyboard and the Control keyboard are provided in a study by Yokota et al. (2024)33.
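The distinction between the ordinary spectrum and the envelope spectrum can be reproduced with a toy signal. The sketch below is not the Gamma Keyboard itself: it assumes a single 440-Hz sinusoidal carrier, a 45-Hz modulator with 50% depth, full-wave rectification as a crude envelope estimate, and hypothetical function names.

```python
import cmath
import math


def am_tone(carrier_hz, mod_hz, depth, fs_hz=8000, duration_s=1.0):
    """Sinusoidal carrier whose amplitude is modulated at mod_hz (depth 0 = unmodulated)."""
    n_samples = int(fs_hz * duration_s)
    return [
        (1.0 + depth * math.cos(2 * math.pi * mod_hz * n / fs_hz))
        * math.sin(2 * math.pi * carrier_hz * n / fs_hz)
        for n in range(n_samples)
    ]


def dft_magnitude(signal, freq_hz, fs_hz=8000):
    """Normalized magnitude of the Fourier coefficient at a single frequency."""
    coeff = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
        for n, x in enumerate(signal)
    )
    return abs(coeff) / len(signal)


def rectified_envelope(signal):
    """Full-wave rectification: a crude stand-in for envelope extraction."""
    return [abs(x) for x in signal]


def envelope_peak_hz(signal, fs_hz=8000, candidates=range(30, 61)):
    """Candidate frequency with the largest envelope-spectrum magnitude."""
    envelope = rectified_envelope(signal)
    return max(candidates, key=lambda f: dft_magnitude(envelope, f, fs_hz))


gamma_like = am_tone(carrier_hz=440, mod_hz=45, depth=0.5)
control_like = am_tone(carrier_hz=440, mod_hz=45, depth=0.0)

print(envelope_peak_hz(gamma_like))
print(dft_magnitude(gamma_like, 45))                        # waveform spectrum at 45 Hz
print(dft_magnitude(rectified_envelope(gamma_like), 45))    # envelope spectrum at 45 Hz
print(dft_magnitude(rectified_envelope(control_like), 45))  # control envelope at 45 Hz
```

In this toy case the waveform spectrum contains essentially no energy at 45 Hz, because amplitude modulation places energy at the carrier and its sidebands (395, 440, and 485 Hz), whereas the envelope spectrum of the modulated tone shows a clear 45-Hz component that is absent for the unmodulated tone.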
Acoustic properties of the Gamma Keyboard (A) and Control Keyboard (B). For each stimulus, the panels show: (Top) a partial waveform over time, (Middle) the frequency spectrum, which illustrates the musical pitch content, and (Bottom) the envelope frequency spectrum, which illustrates the rhythm of the volume modulation. The distinct peak at the target frequency in the bottom panel of (A) represents the embedded gamma stimulation, which is absent in the control stimulus (B).
Experimental procedure
Each session began with a 5-min EEG recording while participants listened to chirp music, a proprietary stimulus created by VIE Inc. that embeds a frequency-swept chirp (30–60 Hz) within a musical piece. After the IGF was identified, the participants completed cognitive tasks under two conditions: IGF, in which the music amplified the participant’s IGF component, and control, in which the music contained no salient gamma-band enhancement.
In both conditions, participants listened to the assigned music for 5 min, followed by five rounds of the IR phase of the VLT. While continuing to listen to the same music, they performed the VCDT, the Matching Game, and the BST in that order, and finally completed the LDFR phase of the VLT. A 10‑min rest period separated the IGF and control conditions. During all music presentations (chirp music, IGF music, and control music), the participants kept their eyes closed and maintained a relaxed posture. During EEG acquisition, participants were monitored to ensure the absence of unnecessary body movements and abnormal EEG waveforms.
VLT - IR
The CVLT39 is a prototypical verbal learning test. As the CVLT word list was developed for English-speaking Americans, a direct Japanese translation could not adequately control for cross-cultural differences in word familiarity. Therefore, we employed the Japanese word list developed by Takeda40, which balances word difficulty level and imagery ratings and was modeled on the auditory verbal learning test (AVLT)41. The list comprises 15 words drawn from distinct semantic categories and is available in two parallel versions. One version was assigned to the IGF condition and the other to the control condition, with the assignment counterbalanced across participants.
IR was performed five times for each music condition. In each trial, the entire 15-word list was read aloud at a constant pace, after which the participants freely recalled as many words as possible. Approximately 15 s after recall ceased, or when the participant indicated they could remember no more words, a single prompt (“Anything else?”) was given. All five IR trials were completed even if the participant recalled all 15 words in an earlier trial. Upon completing the IR block, the participants performed the VCDT, the Matching Game, and the BST. As these three tasks are non-verbal, they were not expected to interfere with the subsequent assessments of verbal memory.
VCDT
The VCDT is widely used to probe visual short-term memory, working memory, and visual attention. A schematic of the task sequence is shown in Fig. 4. Each trial began with a 1000‑ms blank screen, followed by a 300‑ms sample array composed of several colored squares. After another 1000‑ms blank interval, a test array containing the same number of colored squares was presented. Participants compared the two arrays and pressed the left shift key if any square had changed and the right shift key if no change was detected.
The set size (3, 4, 8, or 12 squares) was randomly selected for every trial. A total of 72 trials were administered per music condition, and the participants were instructed to respond as quickly and accurately as possible.
Fig. 4. Time flow of the VCDT. In each trial, an array of colored squares was presented twice in succession. Participants were instructed to indicate as quickly and accurately as possible whether the second array was identical to or different from the first. A total of 72 trials were administered for each music condition.
Matching game
Working memory was assessed based on the outcomes of a card-based matching game (Fig. 5). The stimulus set comprised 12 playing cards, two copies of each of six different identities: Ace of Hearts, Two of Spades, Three of Diamonds, Four of Clubs, King of Spades, and a Joker. At the start of each trial, all cards were displayed face-up for 10 s to encourage memorization. The cards were then turned face-down, and participants selected pairs sequentially by left-clicking with the mouse. If the second card matched the first card, the pair was scored as correct, whereas a mismatch immediately terminated the trial. Participants were instructed to begin with the cards that they felt most confident about. A new trial began when all cards were matched or when an error occurred. Ten trials were conducted under each music condition.
The theoretical maximum score was six correct pairs. However, once five pairs were matched, the remaining pair was necessarily correct. To account for this, the maximum score was capped at five for statistical analysis.
Fig. 5. Time flow of the Matching Game. For the first 10 s, all cards were displayed face-up, and participants were instructed to memorize their locations. The cards were then turned face-down, after which participants selected pairs in the order of their confidence, choosing two cards at a time. A trial ended either when all pairs were uncovered without error or immediately after the first mismatched pair. The task comprised 10 trials per music condition.
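The scoring rule for a single Matching Game trial can be sketched as follows. This is an illustrative reading of the description above rather than the authors' PsychoPy code; the function name and the boolean-list input format are hypothetical.

```python
MAX_SCORED_PAIRS = 5  # the sixth pair is forced once five pairs are matched


def trial_score(pair_outcomes):
    """Count correct pairs up to the first mismatch, capped at five.

    pair_outcomes: booleans in selection order (True = the pair matched).
    """
    correct = 0
    for matched in pair_outcomes:
        if not matched:
            break  # a mismatch ends the trial immediately
        correct += 1
    return min(correct, MAX_SCORED_PAIRS)


# Two hypothetical trials: an error on the third pair, and a perfect run.
scores = [trial_score([True, True, False]), trial_score([True] * 6)]
```

Capping at five means that a perfect trial and a trial with five deliberate matches followed by the forced sixth pair receive the same score.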
BST
This task was used to assess executive functions, including task switching and interference control. At the beginning of each block, participants were instructed to respond based on either the shape or color of the upcoming stimuli. Following a previous study35, the stimuli combined two shapes (circle or square) with two colors (red or blue), yielding four possible stimuli: red circles, red squares, blue circles, and blue squares. One of these stimuli was presented at the center of the screen in each trial. The response cues, a blue circle and a red square, were displayed permanently at the bottom left and right of the screen.
When the central stimulus matched the response cue in both shape and color (congruent condition), responses were facilitated. When it was mismatched in either dimension (incongruent condition), participants had to suppress the irrelevant attribute (Fig. 6). The task instructions switched the relevant feature (shape vs. color) every 20 trials, and the left–right positions of the response cues were swapped every 40 trials. A total of 160 trials were administered per music condition, and the participants were instructed to respond as quickly and accurately as possible.
Fig. 6. BST. Participants were instructed to respond to either the color or the shape of a stimulus presented at the center of the screen. Two reference images serving as response cues (e.g., a blue circle and a red square) were continuously displayed at the bottom of the screen. A trial was classified as congruent when the central stimulus matched the cue in both color and shape, and as incongruent when it mismatched on either dimension.
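The congruency rule and the block schedule described above can be sketched in a few lines. This is a hedged illustration only: the text does not state which rule (shape or color) came first or which cue layout was used initially, so `first_rule` and the unswapped starting layout are assumptions, and the function names are hypothetical.

```python
# The two response cues named in the text, as (color, shape) pairs.
RESPONSE_CUES = {("blue", "circle"), ("red", "square")}


def congruency(color, shape):
    """A stimulus is congruent if it is identical to one of the response cues."""
    return "congruent" if (color, shape) in RESPONSE_CUES else "incongruent"


def block_settings(trial_index, first_rule="shape"):
    """Relevant feature and cue layout for a 0-based trial index.

    Assumption: the starting rule and starting cue layout are not given
    in the text and are chosen here for illustration.
    """
    other_rule = "color" if first_rule == "shape" else "shape"
    # The relevant feature switches every 20 trials.
    rule = first_rule if (trial_index // 20) % 2 == 0 else other_rule
    # The left-right cue positions are swapped every 40 trials.
    cues_swapped = (trial_index // 40) % 2 == 1
    return rule, cues_swapped


schedule = [block_settings(i) for i in range(160)]
```

Under these assumptions, red circles and blue squares are the incongruent stimuli, because each shares its color with one cue and its shape with the other.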
VLT - LDFR
After completing the BST, the participants were asked to recall the word list presented during the IR phase of the VLT. The procedure mirrored that of IR: 15 s after recall ceased, or when participants indicated they could remember no further words, a single additional prompt (“Anything else?”) was given. The interval between IR and LDFR was 10–15 min. When the first LDFR block was completed, the participants took a 10-min break before repeating the entire protocol under the other music condition.
Experimental devices
EEG responses were recorded using a portable EEG device (Miyuki Giken, Bunkyō City, Tokyo, Japan; Polymate Pocket MP208) with an active electrode placed at FCz, in accordance with the International 10–20 system. All signals were referenced to the right mastoid, and the ground electrode was placed on the left. All electrode impedances were kept below 40 kΩ, and all signals were sampled at 500 Hz. All sounds were presented to participants via headphones (Sony Corporation, Minato City, Tokyo, Japan; MDR-CD900ST) and played using VLC media player (VideoLAN Organization, Paris, France).
The VCDT was implemented and presented using the Psychology Experiment Building Language (PEBL) test battery42. The Matching Game and the BST were implemented and presented using PsychoPy software43. Responses for the VCDT and BST were collected via keyboard presses, and responses for the Matching Game were collected via mouse clicks.
EEG analysis to extract IGF
EEG analyses were conducted using MATLAB 2023 (MathWorks, Natick, Massachusetts, USA) and FieldTrip44 functions. Continuous EEG signals were digitally filtered using a finite-impulse response bandpass filter (1–70 Hz, order: 1500) and a notch filter (59–61 Hz, order: 1500) to remove power-line noise. To minimize contamination from ocular artifacts, participants were instructed to keep their eyes closed throughout the recording, a standard procedure known to substantially reduce blinking and saccadic movements. The EEG data were then segmented into 2300-ms epochs (−500 to 1800 ms) relative to the auditory trigger marking the onset of the chirp stimulus embedded in the chirp music. Epochs exceeding ±60 µV were to be removed from the analysis. Notably, no epochs were rejected for any participant under this criterion; thus, all 47 trials per participant were retained for the subsequent analysis.
Time–frequency transformation was conducted using a multi-taper method with a Hanning window, focusing on frequencies between 25 and 65 Hz in 1-Hz steps. Fourteen cycles were employed for each frequency, and the time axis ranged from −200 ms (before stimulus onset) to 1600 ms (after stimulus onset) in 2-ms increments.
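The epoching and amplitude-rejection steps can be sketched on a toy recording. This is an illustrative sketch in plain Python rather than the FieldTrip pipeline used in the study; the function names are hypothetical, the filtering stage is omitted, and applying the ±60 µV criterion to raw sample values without baseline correction is an assumption.

```python
FS_HZ = 500                  # sampling rate stated in the text
PRE_MS, POST_MS = 500, 1800  # epoch limits relative to the chirp trigger
REJECT_UV = 60.0             # amplitude criterion in microvolts


def extract_epoch(signal_uv, trigger_index):
    """Return the samples from -500 ms to +1800 ms around a trigger sample."""
    start = trigger_index - PRE_MS * FS_HZ // 1000
    stop = trigger_index + POST_MS * FS_HZ // 1000
    if start < 0 or stop > len(signal_uv):
        raise ValueError("epoch extends beyond the recording")
    return signal_uv[start:stop]


def keep_epoch(epoch_uv):
    """True if no sample exceeds +/-60 microvolts."""
    return max(abs(v) for v in epoch_uv) <= REJECT_UV


# Toy recording: a flat 5 uV signal with one 80 uV artifact at sample 3000.
recording = [5.0] * 5000
recording[3000] = 80.0
clean = extract_epoch(recording, trigger_index=1000)
noisy = extract_epoch(recording, trigger_index=2500)
```

At 500 Hz, a 2300-ms epoch corresponds to 1150 samples, 250 before the trigger and 900 after it.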
This study adopted the phase-locking factor (PLF) and evoked power as indices to determine the IGF of each participant. The PLF was obtained by performing a time–frequency analysis on single-trial EEG epochs (i.e., without trial averaging) and quantifying the inter-trial phase synchronization at each time–frequency point. The evoked power was derived by first averaging the EEG epochs across trials to produce an averaged waveform and then applying time–frequency analysis, which provides the power at every time and frequency point.
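The two indices can be illustrated at a single time–frequency point. The sketch below assumes that the time–frequency transform is linear, so that transforming the trial-averaged waveform is equivalent to averaging the complex single-trial coefficients; the function names and toy coefficients are hypothetical and do not reproduce the FieldTrip computation.

```python
import cmath


def phase_locking_factor(coeffs):
    """Length of the mean unit phase vector across trials (0 to 1)."""
    unit_vectors = [c / abs(c) for c in coeffs]
    return abs(sum(unit_vectors) / len(unit_vectors))


def evoked_power(coeffs):
    """Power of the trial-averaged complex coefficient."""
    return abs(sum(coeffs) / len(coeffs)) ** 2


# Four trials with identical phase versus four trials with opposing phases.
locked = [cmath.rect(2.0, 0.3) for _ in range(4)]
opposed = [cmath.rect(2.0, 0.0), cmath.rect(2.0, cmath.pi)] * 2

plf_locked, plf_opposed = phase_locking_factor(locked), phase_locking_factor(opposed)
power_locked, power_opposed = evoked_power(locked), evoked_power(opposed)
```

The PLF discards amplitude and reflects only phase consistency, whereas evoked power is also scaled by response amplitude, which is one reason the two indices are reported side by side.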
For both PLF and evoked power, responses were collapsed within 75-ms windows for every 1-Hz increment between 30 and 60 Hz. Each window was aligned to the onset of the corresponding frequency in both the chirp-down and chirp-up segments, and the data points falling within that window were averaged34. For each frequency, we averaged the chirp-down and chirp-up values, separately for PLF and evoked power, and the frequency that yielded the highest value was defined as the participant’s IGF.
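The final selection step can be sketched for one index at a time. This is a minimal illustration with made-up values; how the PLF-based and evoked-power-based results were combined when they disagreed is not specified in the text and is not modeled here, and `select_igf` is a hypothetical name.

```python
def select_igf(down_values, up_values):
    """Average chirp-down and chirp-up values per frequency and return the peak.

    down_values, up_values: dicts mapping frequency (Hz) to the
    window-averaged index value (PLF or evoked power).
    """
    averaged = {f: (down_values[f] + up_values[f]) / 2 for f in down_values}
    return max(averaged, key=averaged.get)


# Hypothetical window-averaged PLF values for 30-60 Hz.
down = {f: 0.10 for f in range(30, 61)}
up = dict(down)
down[42], up[42] = 0.30, 0.40   # consistent response in both sweeps
down[50], up[50] = 0.45, 0.15   # strong in one sweep only
igf = select_igf(down, up)
```

Averaging the two sweep directions favors frequencies that respond consistently in both, so the 42-Hz entry wins here even though the single largest value occurs at 50 Hz.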
Behavioral analysis
Behavioral indices were calculated to compare performance between the IGF and control conditions. For the VLT, the outcome measures were the number of words correctly recalled in the five immediate recall trials (IR1–IR5) and a single LDFR trial. For the VCDT, Matching Game, and BST, performance was quantified using the inverse efficiency score (IES)45, which combines reaction time (RT) and accuracy. IES was computed as IES = (mean RT for correct trials) / accuracy, where accuracy is expressed as a proportion (0–1).
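As a worked example of the IES formula, the sketch below computes the score for four hypothetical trials. The function name and the input format (parallel lists of RTs in milliseconds and correctness flags) are assumptions made for illustration.

```python
def inverse_efficiency(rts_ms, correct):
    """IES = mean RT on correct trials divided by proportion correct."""
    n_correct = sum(correct)
    if n_correct == 0:
        raise ValueError("IES is undefined when no trial is correct")
    mean_rt_correct = sum(rt for rt, ok in zip(rts_ms, correct) if ok) / n_correct
    accuracy = n_correct / len(correct)
    return mean_rt_correct / accuracy


# Three correct trials (mean RT 600 ms) and one error: 600 / 0.75.
ies = inverse_efficiency([500, 600, 700, 900], [True, True, True, False])
```

Because accuracy is in the denominator, errors inflate the score, so a lower IES indicates more efficient performance.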
Statistical analysis for behavioral data
To explore which factors best explain cognitive performance in a data-driven manner, we adopted an information-theoretic model-selection approach. This approach suits exploratory research because it allows the evidence for multiple competing models, each incorporating a different set of explanatory variables, to be compared directly for each cognitive task. Model quality was quantified using Akaike’s Information Criterion (AIC)36:
AIC = −2 ln(L) + 2k
where L is the maximum likelihood and k is the total number of free parameters, including both fixed and random effect terms.
Smaller AIC values indicate a better trade-off between the goodness of fit and model complexity. Following Burnham and Anderson37, models within ΔAIC ≤ 2 of the minimum were considered statistically indistinguishable; in such cases, the simplest model (fewest parameters) was retained. As AIC‑based model selection is philosophically distinct from Neyman–Pearson hypothesis testing that relies on a predetermined significance level (α), we did not perform additional null‑hypothesis tests on the models ultimately selected. Nevertheless, for transparency, we have reported the corresponding P‑values and 95% confidence intervals for each parameter estimate.
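The selection rule described above can be sketched as follows, using the standard definition AIC = −2 ln(L) + 2k. The candidate names, log-likelihoods, and parameter counts are hypothetical values chosen for illustration, not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion from a maximized log-likelihood."""
    return -2.0 * log_likelihood + 2 * n_params


def select_model(candidates, threshold=2.0):
    """Pick the simplest model among those within `threshold` of the minimum AIC.

    candidates: dict mapping model name to (log_likelihood, n_params).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    close = [name for name, score in scores.items() if score - best <= threshold]
    # Fewest parameters first; ties are broken by the lower AIC.
    chosen = min(close, key=lambda name: (candidates[name][1], scores[name]))
    return chosen, scores


# Hypothetical fits: (log-likelihood, number of free parameters).
candidates = {
    "null": (-100.0, 3),
    "music": (-96.0, 4),
    "music+order": (-95.8, 5),
}
chosen, scores = select_model(candidates)
```

In this toy case the two non-null models differ by less than 2 AIC units, so the model with fewer parameters is retained.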
LME models were fitted with two fixed effects, Music (IGF vs. control) and ExpOrder (First vs. Second), and a random intercept for participants. The most complex specification can be written using Wilkinson notation:
The null model contained only an intercept and the random effect. If no candidate model outperformed the null by ΔAIC ≥ 2, we concluded that the fixed effects did not meaningfully explain the behavioral outcome.
Following the primary model selection, we conducted two additional exploratory analyses to better understand the nature of the observed effects. First, to test whether the amount of IGF music exposure influenced VLT performance, we specified an additional model for IR5 in which the predictors Music and ExpOrder were replaced by the categorical variable Listening Time with three levels—None, Briefly (≈ 5 min), and Full (≈ 15 min total exposure):
Second, to investigate the potential influence of individual differences, we added age as a covariate to the models for the tasks that showed a significant music effect.
To contextualize the robustness of our findings and assess the risk of Type II errors for null results, we conducted post-hoc power analyses for the fixed effect of Music in each cognitive task. Power was calculated via simulation (1000 iterations) using the simr package in R by comparing the model with the Music term to the corresponding model without it.
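The logic of simulation-based power estimation can be illustrated with a deliberately simplified stand-in. The sketch below does not reproduce the simr procedure, which refits mixed-effects models and compares them; instead it swaps in a paired t-test on simulated within-participant differences. The effect sizes, the unit standard deviation, and the seed are illustrative assumptions.

```python
import math
import random
import statistics

T_CRIT_DF28 = 2.048  # two-sided 5% critical value of Student's t with 28 df


def simulate_power(effect, sd=1.0, n=29, iterations=1000, seed=0):
    """Share of simulated datasets in which a paired t-test rejects the null.

    effect: assumed mean within-participant difference (IGF minus control).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(iterations):
        diffs = [rng.gauss(effect, sd) for _ in range(n)]
        t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
        if abs(t_stat) > T_CRIT_DF28:
            rejections += 1
    return rejections / iterations


false_positive_rate = simulate_power(effect=0.0)
power_large_effect = simulate_power(effect=1.0)
```

With no true effect the rejection rate should sit near the nominal 5% level, and with a large standardized effect it should approach 1; intermediate effects give the power values that matter when interpreting null results.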
Data availability
Raw EEG recordings contain potentially identifiable information. Since explicit participant consent for public release was not obtained, the raw data cannot be shared. However, the anonymized datasets underlying the tables and figures used in this study are available upon reasonable request. All inquiries should be addressed to the corresponding author (yokota@vie.style).
References
Tallon-Baudry, C., Bertrand, O., Peronnet, F. & Pernier, J. Induced γ-band activity during the delay of a visual short-term memory task in humans. J. Neurosci. 18, 4244–4254 (1998).
Gruber, T., Tsivilis, D., Montaldi, D. & Müller, M. M. Induced gamma band responses: an early marker of memory encoding and retrieval. Neuroreport 15, 1837–1841 (2004).
Herrmann, C. S., Munk, M. H. J. & Engel, A. K. Cognitive functions of gamma-band activity: memory match and utilization. Trends Cogn. Sci. 8, 347–355 (2004).
Jensen, O. & Lisman, J. E. Hippocampal sequence-encoding driven by a cortical multi-item working memory buffer. Trends Neurosci. 28, 67–72 (2005).
Mormann, F. et al. Phase/amplitude reset and theta-gamma interaction in the human medial temporal lobe during a continuous word recognition memory task. Hippocampus 15, 890–900 (2005).
Osipova, D. et al. Theta and gamma oscillations predict encoding and retrieval of declarative memory. J. Neurosci. 26, 7523–7531 (2006).
Fries, P., Nikolić, D. & Singer, W. The gamma cycle. Trends Neurosci. 30, 309–316 (2007).
Lisman, J. Working memory: the importance of theta and gamma oscillations. Curr. Biol. 20, R490–R492 (2010).
Shirvalkar, P. R., Rapp, P. R. & Shapiro, M. L. Bidirectional changes to hippocampal theta-gamma comodulation predict memory for recent spatial episodes. Proc. Natl. Acad. Sci. U S A. 107, 7054–7059 (2010).
Kucewicz, M. T. et al. Dissecting gamma frequency activity during human memory processing. Brain 140, 1337–1350 (2017).
Griffiths, B. J. et al. Directional coupling of slow and fast hippocampal gamma with neocortical alpha/beta oscillations in human episodic memory. Proc. Natl. Acad. Sci. U S A. 116, 21834–21842 (2019).
Herrmann, C. S. & Demiralp, T. Human EEG gamma oscillations in neuropsychiatric disorders. Clin. Neurophysiol. 116, 2719–2733 (2005).
Mathalon, D. H. & Sohal, V. S. Neural oscillations and synchrony in brain dysfunction and neuropsychiatric disorders: it’s about time. JAMA Psychiatry. 72, 840–844 (2015).
Stam, C. J. et al. Generalized synchronization of MEG recordings in Alzheimer’s disease: evidence for involvement of the gamma band. J. Clin. Neurophysiol. 19, 562–574 (2002).
Koenig, T. et al. Decreased EEG synchronization in Alzheimer’s disease and mild cognitive impairment. Neurobiol. Aging. 26, 165–171 (2005).
Başar, E., Emek-Savaş, D. D., Güntekin, B. & Yener, G. G. Delay of cognitive gamma responses in Alzheimer’s disease. Neuroimage Clin. 11, 106–115 (2016).
Shin, Y. W., O’Donnell, B. F., Youn, S. & Kwon, J. S. Gamma oscillation in schizophrenia. Psychiatry Investig. 8, 288–296 (2011).
Iaccarino, H. F. et al. Gamma frequency entrainment attenuates amyloid load and modifies microglia. Nature 540, 230–235 (2016).
Martorell, A. J. et al. Multi-sensory gamma stimulation ameliorates Alzheimer’s-associated pathology and improves cognition. Cell 177, 256–271.e22 (2019).
Adaikkan, C. et al. Gamma entrainment binds higher-order brain regions and offers neuroprotection. Neuron 102, 929–943.e8 (2019).
Liu, C. et al. Modulating gamma oscillations promotes brain connectivity to improve cognitive impairment. Cereb. Cortex. 32, 2644–2656 (2022).
Santarnecchi, E. et al. Individual differences and specificity of prefrontal gamma frequency-tACS on fluid intelligence capabilities. Cortex 75, 33–43 (2016).
Jones, K. T. et al. Gamma neuromodulation improves episodic memory and its associated network in amnestic mild cognitive impairment: A pilot study. Neurobiol. Aging. 129, 72–88 (2023).
Chan, D. et al. Gamma frequency sensory stimulation in mild probable Alzheimer’s dementia patients: Results of feasibility and pilot studies. PLoS One 17, e0278412 (2022).
Galambos, R., Makeig, S. & Talmachoff, P. J. A 40-Hz auditory potential recorded from the human scalp. Proc. Natl. Acad. Sci. U S A. 78, 2643–2647 (1981).
Murdock, M. H. et al. Multisensory gamma stimulation promotes glymphatic clearance of amyloid. Nature 627, 149–156 (2024).
Picton, T. W., John, M. S., Dimitrijevic, A. & Purcell, D. Human auditory steady-state responses. Int. J. Audiol. 42, 177–219 (2003).
van Pelt, S., Shumskaya, E. & Fries, P. Cortical volume and sex influence visual gamma. Neuroimage 178, 702–712 (2018).
Güntekin, B. et al. Alterations of resting-state gamma frequency characteristics in aging and Alzheimer’s disease. Cogn. Neurodyn. 17, 829–844 (2023).
Arnfred, S. M., Raballo, A., Morup, M. & Parnas, J. Self-Disorder and brain processing of proprioception in schizophrenia spectrum patients: A Re-analysis. Psychopathology 48, 60–64 (2015).
Griskova-Bulanova, I. et al. Envelope following response to 440 Hz carrier chirp-modulated tones show clinically relevant changes in schizophrenia. Brain Sci. 11, 22 (2020).
Baltus, A., Vosskuhl, J., Boetzel, C. & Herrmann, C. S. Transcranial alternating current stimulation modulates auditory temporal resolution in elderly people. Eur. J. Neurosci. 51, 1328–1338 (2020).
Yokota, Y. et al. Gamma music: A new acoustic stimulus for gamma-frequency auditory steady-state response. Front. Hum. Neurosci. 17, 1287018 (2024).
Mockevičius, A. et al. Extraction of individual EEG gamma frequencies from the responses to click-based chirp-modulated sounds. Sensors 23, 2826 (2023).
Griškova-Bulanova, I. et al. Responses at individual gamma frequencies are related to the processing speed but not the inhibitory control. J. Pers. Med. 13, 26 (2022).
Akaike, H. A new look at the statistical model identification. IEEE Trans. Automat Contr. 19, 716–723 (1974).
Burnham, K. P. & Anderson, D. R. Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Methods Res. 33, 261–304 (2004).
Bartos, M., Vida, I. & Jonas, P. Synaptic mechanisms of synchronized gamma oscillations in inhibitory interneuron networks. Nat. Rev. Neurosci. 8, 45–56 (2007).
Delis, D. C., Freeland, J., Kramer, J. H. & Kaplan, E. Integrating clinical assessment with cognitive neuroscience: Construct validation of the California Verbal Learning Test. J. Consult Clin. Psychol. 56, 123–130 (1988).
Takeda, K., Nakamura, H. & Tokuchi, R. Aging and verbal memory -an experimental study using structured and non-structured word lists. Nippon Ronen Igakkai Zasshi. 55, 117–123 (2018).
Vakil, E. & Blachstein, H. Rey Auditory-Verbal Learning Test: Structure analysis. J. Clin. Psychol. 49, 883–890 (1993).
Mueller, S. T. & Piper, B. J. The Psychology Experiment Building Language (PEBL) and PEBL test battery. J. Neurosci. Methods. 222, 250–259 (2014).
Peirce, J. et al. PsychoPy2: Experiments in behavior made easy. Behav. Res. Methods. 51, 195–203 (2019).
Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J. M. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, 1–9 (2011).
Bruyer, R. & Brysbaert, M. Combining speed and accuracy in cognitive psychology: Is the inverse efficiency score (IES) a better dependent variable than the mean reaction time (RT) and the percentage of errors (PE)? Psychol. Belg. 51, 5 (2011).
Funding
The authors received financial support for the research, authorship, and publication of this article. This study was financially supported by Vie, Inc. (https://www.viestyle.co.jp/en/) in the form of salaries for Y.Y., M.C., K.T., S.K., S.F., Y.N., and Y.I.
Author information
Contributions
Y.Y., M.C., and Y.N. designed the study. Y.Y. performed the experiments and analyzed the data. Y.Y. wrote the main manuscript text and prepared all figures and tables. K.T. developed the music dataset. S. K., S. F., and Y. I. supervised the study. Y.N. provided critical revisions. All authors have approved the manuscript.
Ethics declarations
Competing interests
Y.Y., M.C., K.T., S.K., S.F., Y.N., and Y.I., who were associated with this research, were employed by Vie, Inc. The authors would like to declare the following patents associated with this research: PCT/JP2024/009594, JP Appl. No. 2023-571398, and JP Appl. No. 2024-154181. The authors declare the following products of the development of this research: VIE Tunes and VIE Tunes Pro. All the remaining authors declare no conflict of interest.
Ethics statement
Our study was approved by the Shiba Palace Clinic Ethics Review Committee. The study was conducted in accordance with local legislation and institutional requirements. All the participants provided written informed consent to participate in this study.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yokota, Y., Tanaka, K., Chang, M. et al. Auditory stimulation at individual gamma frequency enhances cognitive performance. Sci Rep 15, 38697 (2025). https://doi.org/10.1038/s41598-025-22360-0
DOI: https://doi.org/10.1038/s41598-025-22360-0





