Introduction

Visual learning, which refers to the enhancement of visual skills after training1, has been widely used to study adult neural plasticity in the visual system2,3,4. Like other forms of learning, visual learning involves two key stages: encoding and consolidation. Previous research has mainly focused on the encoding stage. Learning-induced changes in neural encoding manifest in various forms, including sharpened tuning curves5,6,7, increased response gain8,9,10, and reduced noise correlation11. Meanwhile, growing evidence shows that consolidation also plays a critical role in the efficacy of visual learning12,13,14. Consolidation transforms fragile learning traces into long-term storage, resulting in improved behavioral performance that is often described as an “offline” gain.

Consolidation occurs not only during sleep but also during wakefulness. While most research has focused on sleep consolidation15, our understanding of wakeful consolidation remains limited. Wakeful consolidation occurs within a temporal window following training and is characterized by unstable learning traces. For instance, introducing an interfering task after training disrupted wakeful consolidation, leading to a subsequent deterioration in behavioral performance in a retest12. Similar post-training manipulation paradigms have been widely used to behaviorally operationalize wakeful consolidation. Alternatively, wakeful consolidation can be operationalized through neural reactivation of learned materials observed after training, as reflected in enhanced BOLD (blood-oxygen-level-dependent) signals and increased BOLD decoding accuracy in visual cortical areas following visual training16,17. Additionally, it is worth noting that two studies have observed changes in the ratio between the primary excitatory (glutamate) and inhibitory (γ-aminobutyric acid, GABA) neurotransmitters in the visual cortex following training, suggesting that the excitation/inhibition (E/I) balance may also undergo changes during wakeful consolidation18,19.

The changes in neural excitation during wakeful consolidation hold promise as a potential avenue for modulating newly acquired learning traces. Recent studies have explored the effects of non-invasive electrical/magnetic brain stimulation on modulating the wakeful consolidation of visual learning. Low-frequency repetitive and theta-burst transcranial magnetic stimulations (TMS), both of which reduce neural excitability20,21, have been shown to disrupt the wakeful consolidation when applied after training14,22. Conversely, anodal transcranial direct current stimulation (a-tDCS), which increases neural excitability23,24, has been reported to facilitate wakeful consolidation25. Together, these findings suggest that modulating cortical excitation may causally influence wakeful consolidation.

In addition to electrical and magnetic stimulation, repetitive visual stimulation (RVS) at various frequencies (delta26, theta27, alpha28,29, beta30, and gamma31,32) is emerging as a promising tool for modulating perceptual and cognitive functions27, electrophysiological activity29, cerebral hemodynamics30, and even symptoms of Alzheimer’s disease32. Among these, alpha-frequency (8–12 Hz) RVS has garnered substantial evidence for its effects on enhancing early ERP (event-related potential) components33,34,35 and BOLD responses in the visual cortex36. These effects have been linked to glutamatergic and GABAergic activities (i.e., neural excitation)34,36 and are commonly recognized as markers of long-term potentiation-like (LTP-like) plasticity33,37. By contrast, 1-Hz RVS has recently been reported to induce long-term depression-like (LTD-like) plasticity in cognitive performance38,39, supporting the frequency-dependent nature of RVS-induced plasticity.

Here, we aim to investigate two unexplored but important questions: (1) Can photic stimulation (e.g., RVS), in addition to behavioral interventions and electrical/magnetic stimulations, be used to modulate wakeful consolidation? (2) What are the neural underpinnings giving rise to the modulations? To this end, we employ an RVS paradigm across five experiments to examine the impacts of RVS on wakeful consolidation. Wakeful consolidation is operationally defined as the neural processes that immediately follow training. Interventions targeting these processes can modulate behavioral performance in a subsequent test, in line with the post-training manipulation paradigms. Specifically, participants receive RVS at different frequencies immediately after visual training, and we collect neuroimaging data of multiple modalities to assess the corresponding neural effects. Our findings reveal previously unknown and frequency-dependent impacts of RVS on wakeful consolidation, as well as on ERPs and neurotransmitter concentrations.

Results

RVS modulates wakeful consolidation in a frequency-dependent manner

Our primary objective was to determine whether, and if so, how photic stimulation can alter visual performance through modulating wakeful consolidation. Previous studies have shown that alpha-frequency photic stimulation induces LTP-like plasticity in the human visual cortex33,34,37 (see review40). By contrast, 1-Hz RVS was posited to induce LTD-like plasticity38,39. We therefore investigated the frequency effect of RVS by adopting three different frequency conditions: 10, 1, and 0 Hz (static control). Here in Experiment 1, we employed an orientation discrimination task (ODT) for the visual learning paradigm41. On each trial, two noise-embedded gabors with slightly different orientations appeared successively at a fixed position—either the lower-left or lower-right quadrant of the visual field—which was assigned per participant and remained constant throughout the experiment. Participants reported whether the second stimulus was rotated clockwise or counterclockwise relative to the first (Fig. 1a; see “Orientation discrimination task” in “Methods”).

Fig. 1: Experiments 1 to 3: Psychophysical measurements.

a Behavioral setup. Participants maintained central fixation while seated with their heads stabilized by a chin rest. During the ODT (orientation discrimination task), participants discriminated the orientation of peripheral gabors and responded using a keyboard. Eye movements were monitored with an eye tracker positioned in front of the participants. b Experimental procedure. Participants underwent the ODT training, RVS (repetitive visual stimulation), rest, and the ODT test. The three RVS conditions were 0, 10, and 1 Hz. c Learning curves in Experiment 1. All three groups exhibited training-induced improvement, indicated by the descending discrimination thresholds. Shaded areas indicate the data included in the ANOVA. Error bars are SEMs. N (number of participants) = 70. d One-sample t-tests were performed for each group to compare learning rates against zero (two-sided, Bonferroni-adjusted: 0-Hz: t(22) = 4.281, adjusted p < 0.001, Cohen’s d = 0.893, 95% CI = [0.400 1.371]; 10-Hz: t(24) = 3.627, adjusted p = 0.003, Cohen’s d = 0.725, 95% CI = [0.277 1.161]; 1-Hz: t(21) = 2.509, adjusted p = 0.020, Cohen’s d = 0.535, 95% CI = [0.082 0.977]). All groups demonstrated positive learning rates during training, consistent with performance improvement. Error bars are SEMs. N = 70. e In Experiment 1, simple main effects analyses from the repeated-measures ANOVA were conducted for each group to compare pre- and post-stimulation performance (two-sided, Bonferroni-adjusted: 10-Hz: F(1,67) = 11.519, p = 0.001, partial η2 = 0.147, 95% CI = [0.026 0.305]; 1-Hz: F(1,67) = 8.434, p = 0.005, partial η2 = 0.112, 95% CI = [0.011 0.265]; 0-Hz: F(1,67) = 1.713, p = 0.195, partial η2 = 0.025, 95% CI = [0.000 0.138]). Discrimination thresholds decreased after the 10-Hz RVS, increased after the 1-Hz RVS, and had no significant change after the 0-Hz RVS. Filled circles, violin plots, and black horizontal lines represent individual data points, kernel density estimates, and group means. N = 70. f Discrimination thresholds in Experiment 2. Error bars are SEMs. N = 70. g No significant group difference was found among untrained participants in Experiment 2. N = 70. h Learning curves in Experiment 3. Error bars are SEMs. N = 40. i Discrimination thresholds were unaffected by either the 10-Hz or 1-Hz RVS in the “uniform-disk” experiment. N = 40. *p < 0.05, **p < 0.01, ***p < 0.001. Source data are provided as a Source Data file.

Seventy participants sat in a behavioral test room and underwent the ODT training for 10 blocks (~30 min), followed immediately by 5 blocks (~25 min) of RVS with high-SNR (signal-to-noise ratio) and high-contrast gratings flickering at different temporal frequencies. The flickering gratings were presented at the same spatial location and with nearly the same orientation as the gabors used in the ODT. Participants were randomly assigned to three groups: 0 Hz (N = 23), 10 Hz (N = 25), and 1 Hz (N = 22) (see “Repetitive visual stimulation” in “Methods”). Following RVS, participants underwent a 1.5-h sleep-free break before completing a post-RVS test consisting of 6 ODT blocks (~20 min). Discrimination thresholds were calculated for each block and then analyzed using a repeated-measures ANOVA with a Session (pre- vs. post-RVS) × Group (0, 10, 1 Hz) design (see “Behavioral measurements” in “Methods”). The “pre” session was defined as the last 6 blocks in the training, and the “post” session as the 6 blocks in the test (Fig. 1b).
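The pre/post session definition above can be illustrated with a minimal sketch. The function name `session_means`, the threshold values, and their unit (degrees) are made up for illustration and are not taken from the study; the sketch only assumes that each participant contributes one threshold per block, with 10 training blocks and 6 test blocks.

```python
import numpy as np

def session_means(training_blocks, test_blocks, n_pre=6):
    """Return (pre, post) mean thresholds for one participant.

    pre  = mean of the last `n_pre` training blocks
    post = mean of all test blocks
    """
    training_blocks = np.asarray(training_blocks, dtype=float)
    test_blocks = np.asarray(test_blocks, dtype=float)
    pre = training_blocks[-n_pre:].mean()
    post = test_blocks.mean()
    return pre, post

# Hypothetical block-wise thresholds for a single participant
training = [6.0, 5.4, 5.0, 4.8, 4.6, 4.4, 4.3, 4.2, 4.1, 4.0]
test = [3.9, 3.8, 3.8, 3.7, 3.7, 3.6]

pre, post = session_means(training, test)
change = post - pre  # negative change = lower threshold = better performance
```

In a Session × Group analysis, the `pre` and `post` values of every participant would form the within-subject factor, with the RVS frequency as the between-subject factor.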

Participants’ performance improved gradually over the course of training (Fig. 1c). A paired t-test comparing the 1st and 10th training blocks revealed a significant reduction in discrimination threshold (t(68) = 8.996, p < 0.001, Cohen’s d = 1.083, 95% CI = [0.783 1.378]; one participant exceeding three standard deviations from the mean was excluded). Individual learning curves were then fitted to a power function to estimate learning rates. All groups showed significantly above-zero learning rates (one-sample t-test, Bonferroni-adjusted: 0-Hz: t(22) = 4.281, adjusted p < 0.001, Cohen’s d = 0.893, 95% CI = [0.400 1.371]; 10-Hz: t(24) = 3.627, adjusted p = 0.003, Cohen’s d = 0.725, 95% CI = [0.277 1.161]; 1-Hz: t(21) = 2.509, adjusted p = 0.020, Cohen’s d = 0.535, 95% CI = [0.082 0.977]) (Fig. 1d; see “Supplementary Methods” for details).
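As a rough illustration of how a learning rate can be estimated from a power-function fit, the sketch below assumes the form threshold(n) = a · n^(−r), where n is the block number and a positive r indicates improvement. This functional form, the log-log least-squares fit, and the name `fit_learning_rate` are assumptions made here; the exact model used in the study is described in its Supplementary Methods and may differ.

```python
import numpy as np

def fit_learning_rate(thresholds):
    """Fit threshold(n) = a * n**(-r) by linear regression in log-log space.

    Returns (r, a); r > 0 means thresholds decrease across blocks.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    blocks = np.arange(1, thresholds.size + 1, dtype=float)
    # log(threshold) = log(a) - r * log(n), so the slope equals -r
    slope, intercept = np.polyfit(np.log(blocks), np.log(thresholds), 1)
    return -slope, np.exp(intercept)

# Synthetic, noise-free learning curve with a known rate of 0.2
blocks = np.arange(1, 11, dtype=float)
synthetic = 6.0 * np.power(blocks, -0.2)

rate, scale = fit_learning_rate(synthetic)
```

On noisy data, a nonlinear least-squares fit on the raw thresholds could give somewhat different estimates than this log-log fit; the per-participant rates would then be tested against zero with a one-sample t-test.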

We next examined the effects of Group and Session on discrimination thresholds. The repeated-measures ANOVA revealed a significant Group × Session interaction (F(2,67) = 10.772, p < 0.001, partial η2 = 0.243, 95% CI = [0.142 0.569]), with no significant main effect of Group (F(2,67) = 1.368, p = 0.262, partial η2 = 0.039, 95% CI = [0.000 0.258]) or Session (F(1,56) = 0.353, p = 0.554, partial η2 = 0.005, 95% CI = [0.000 0.103]). Simple main effects analyses revealed that participants showed performance improvement after the 10-Hz RVS (F(1,67) = 11.519, p = 0.001, partial η2 = 0.147, 95% CI = [0.026 0.305]), performance deterioration after the 1-Hz RVS (F(1,67) = 8.434, p = 0.005, partial η2 = 0.112, 95% CI = [0.011 0.265]), and no significant change after the 0-Hz RVS (F(1,67) = 1.713, p = 0.195, partial η2 = 0.025, 95% CI = [0.000 0.138]) (Fig. 1e). These results suggest that the RVS at 10 and 1 Hz exerted opposite effects on wakeful consolidation, resulting in improvement and deterioration in visual performance, respectively.

RVS alone fails to modulate visual performance

In Experiment 2, we sought to answer the question: does RVS have to be implemented upon wakeful consolidation to modulate visual performance, or can RVS, on its own, induce changes in subsequent visual performance? The procedure of Experiment 2 was identical to that of Experiment 1 except the ODT training was removed. Seventy newly recruited, ODT-naïve participants were randomly assigned to the three groups: 0-Hz (N = 23), 10-Hz (N = 26), and 1-Hz (N = 21). Participants underwent 25 min of RVS, followed by a 1.5-h sleep-free break and then a 20-min ODT test. A one-way ANOVA revealed no group difference in discrimination threshold during the test (F(2,67) = 0.061, p = 0.940, partial η2 = 0.002, 95% CI = [0.000 0.050]; Fig. 1f, g). To further assess subtle group differences, we fitted individual data in the test with a power function and compared the learning rates across groups. No significant between-group differences in learning rate were found (one-way ANOVA: F(2,65) = 0.922, p = 0.403, partial η2 = 0.028, 95% CI = [0.000 0.226]; two participants exceeding three standard deviations from the mean were excluded). Each group showed moderate improvement relative to zero (one-sample t-tests, Bonferroni-adjusted: 0-Hz: t(22) = 3.010, adjusted p = 0.018, Cohen’s d = 0.628, 95% CI = [0.173 1.070]; 10-Hz: t(23) = 2.508, adjusted p = 0.060, Cohen’s d = 0.512, 95% CI = [0.081 0.933]; 1-Hz: t(20) = 3.188, adjusted p = 0.015, Cohen’s d = 0.696, 95% CI = [0.210 1.167]; see “Supplementary Methods” for details; two participants exceeding three standard deviations from the mean were excluded). Together, these results demonstrated that the RVS per se had minimal impact on visual performance and needed to be implemented after training to exert an effect.

Flickering uniform-disk yields no effect on wakeful consolidation

Experiment 3 examined whether the modulatory effects observed in Experiment 1 depended on the spatial content of the flickers, and whether the 10-Hz facilitation effect could be explained by potential attentional effects of alpha entrainment. To test this, we replaced the grating flickers with black-and-white, uniform-luminance disk flickers, which lack orientation information but are known to entrain alpha oscillation in the visual cortex29. Uniform-disk and grating stimuli were matched in size, average luminance, and temporal contrast. The procedure followed that of Experiment 1: participants underwent the training, RVS, a 1.5-h break, and the test. A repeated-measures ANOVA was performed as in Experiment 1. There were no significant effects of Session (F(1,37) = 0.432, p = 0.515, partial η2 = 0.012, 95% CI = [0.000 0.155]), Group (F(1,37) = 0.0003, p = 0.986, partial η2 = 0.000008, 95% CI = [0.000 0.000]), or the interaction (F(1,37) = 0.544, p = 0.465, partial η2 = 0.014, 95% CI = [0.000 0.000]). Simple main effects analyses showed that the 10-Hz uniform-disk RVS did not reduce the discrimination threshold (F(1,37) = 0.003, p = 0.954, partial η2 = 0.00008, 95% CI = [0.000 0.029]), nor did the 1-Hz uniform-disk RVS increase the threshold (F(1,37) = 0.949, p = 0.336, partial η2 = 0.025, 95% CI = [0.000 0.189]; Fig. 1h, i). These findings suggest that the RVS effects on wakeful consolidation depended on the spatial content of the flickers rather than on flickering alone.

10-Hz RVS strengthens the ERPs associated with ODT

Building on the behavioral findings, the following experiments aimed to investigate their underlying neural mechanisms. Previous RVS studies without visual training have demonstrated that flickers at ~10 Hz induced LTP-like changes in ERPs33,34. However, these ERPs were measured during passive viewing and lacked cognitive/behavioral relevance. It remains unclear whether the 10-Hz RVS also modulates ERPs associated with an active task. Therefore, in Experiment 4, we investigated whether the behavioral improvement/impairment induced by the 10-/1-Hz grating RVS was accompanied by corresponding changes in ERPs—i.e., LTP- or LTD-like plasticity—elicited by the ODT stimuli. Forty-five participants were assigned to three groups: 0-Hz (N = 15), 10-Hz (N = 15), and 1-Hz (N = 15). Electroencephalogram (EEG) was recorded throughout the training, RVS, and the test (Fig. 2a). The procedure mirrored that of Experiment 1 except that the break between RVS and the test was extended to 4 h. Careful optimization of stimulus parameters was undertaken to enhance ERPs, given their sensitivity to the physical characteristics of visual stimuli42 (see Table 1 and “Orientation discrimination task” in “Methods” for details).

Fig. 2: Experiment 4: 10-Hz RVS strengthens the ERPs.

a EEG (electroencephalogram) and VEOG (vertical electrooculogram) signals were recorded during the ODT (orientation discrimination task) using a 64-channel cap. EEG signals were referenced at FCz and grounded at AFz (blue). Electrodes P5 and P6 were selected for ERP (event-related potential) analyses (green). Depicted are data from a representative participant. Dashed and solid curves are ERPs from the ipsilateral and contralateral hemispheres, respectively. b Learning curves in Experiment 4. Error bars are SEMs. N (number of participants) = 45. c Simple main effects analyses from the repeated-measures ANOVA were conducted for each group to compare pre- and post-stimulation performance (two-sided, Bonferroni-adjusted: 10-Hz: F(1,41) = 6.036, p = 0.018, partial η2 = 0.128, 95% CI = [0.003 0.328]; 1-Hz: F(1,41) = 12.790, p = 0.001, partial η2 = 0.238, 95% CI = [0.049 0.439]; 0-Hz: F(1,41) = 3.426, p = 0.071, partial η2 = 0.077, 95% CI = [0.000 0.265]). Discrimination thresholds decreased after the 10-Hz RVS, increased after the 1-Hz RVS, and remained unchanged after the 0-Hz RVS. Filled circles are individual participants, violins are kernel density estimates, and black horizontal lines are means. N = 45. d ERP waveforms under 12 combinations (2 hemispheres × 3 groups × 2 sessions). Upper and lower panels show ERPs from the ipsilateral (IPS) and contralateral (CON) electrodes. Vertical lines at 0 ms indicate stimulus onset. ERPs exhibited an enhanced negative peak at around 160 or 190 ms at the CON and IPS electrodes after the 10-Hz RVS. Error bands are SEMs. N = 45. e Simple main effects analyses from the repeated-measures ANOVA were conducted for each group to compare pre- and post-stimulation N1 amplitude (two-sided, Bonferroni-adjusted: 10-Hz: F(1,42) = 12.063, p = 0.001, partial η2 = 0.223, 95% CI = [0.042 0.423]; 1-Hz: F(1,42) = 0.879, p = 0.354, partial η2 = 0.021, 95% CI = [0.000 0.167]; 0-Hz: F(1,42) = 0.168, p = 0.684, partial η2 = 0.004, 95% CI = [0.000 0.113]). The N1 amplitude was enhanced only by the 10-Hz RVS. N = 45. f In the 10-Hz group, Pearson correlation was performed between changes in behavioral performance and changes in N1 amplitude (two-sided: r = −0.542, p = 0.037, 95% CI = [−0.723 −0.292]). The center line shows the best-fit linear regression for the Pearson correlation. Shaded areas represent 95% confidence intervals (CIs). N = 15. *p < 0.05, **p < 0.01. Source data are provided as a Source Data file.

Table 1 Differences in experimental procedure and stimulus design across experiments

Experiment 4 replicated the behavioral findings in Experiment 1. There were no significant main effects of Session (F(1,41) = 3.000, p = 0.091, partial η2 = 0.068, 95% CI = [0.000 0.252]) or Group (F(2,41) = 0.774, p = 0.468, partial η2 = 0.036, 95% CI = [0.000 0.298]; one participant exceeding three standard deviations from the mean was excluded). A significant interaction between Session and Group was observed (F(2,41) = 9.685, p < 0.001, partial η2 = 0.321, 95% CI = [0.167 0.671]). Simple main effects analyses (Bonferroni-adjusted) revealed frequency-dependent modulation of wakeful consolidation by RVS: participants exhibited decreased, increased, and unchanged discrimination thresholds after the 10-Hz (F(1,41) = 6.036, p = 0.018, partial η2 = 0.128, 95% CI = [0.003 0.328]), 1-Hz (F(1,41) = 12.790, p = 0.001, partial η2 = 0.238, 95% CI = [0.049 0.439]), and 0-Hz (F(1,41) = 3.426, p = 0.071, partial η2 = 0.077, 95% CI = [0.000 0.265]) RVS, respectively (Fig. 2b, c). These results further validated the modulatory effects of the 10-Hz and 1-Hz RVS that we demonstrated in Experiment 1, despite slight modifications in stimuli and procedure. Moreover, the RVS effects on wakeful consolidation remained robust even 4 h after RVS.

For the ERP analysis, we first extracted 350-ms EEG epochs time-locked to the onset of the first grating in each ODT trial, which were then baseline-corrected. To obtain the ERPs, EEG epochs were averaged separately for ipsilateral and contralateral electrodes, before and after RVS (Fig. 2d). P5 and P6 were labeled as the “ipsilateral” or “contralateral” electrode based on the hemifield (left or right) in which the grating appeared (see “EEG” in “Methods”). An enhanced N1 component of the ERPs was observed following the 10-Hz RVS. Specifically, the ERPs showed a more negative response in the para-N1 time range at both the contralateral and the ipsilateral electrodes for the 10-Hz group (Bonferroni-adjusted). No ERP change was observed for the 0- or 1-Hz group. To quantify this effect, we extracted peak-to-peak N1 amplitudes from the contralateral electrode43 and conducted a repeated-measures ANOVA with a Session (pre and post) × Group (0, 10, and 1 Hz) design (Fig. 2e). We found a significant main effect of Session (F(1,42) = 7.747, p = 0.008, partial η2 = 0.156, 95% CI = [0.012 0.356]), a marginal Session × Group interaction (F(2,42) = 2.682, p = 0.080, partial η2 = 0.113, 95% CI = [0.000 0.452]), and no main effect of Group (F(2,42) = 0.609, p = 0.549, partial η2 = 0.028, 95% CI = [0.000 0.268]). Further analysis of simple main effects revealed that only the 10-Hz RVS increased the N1 amplitude (F(1,42) = 12.063, p = 0.001, partial η2 = 0.223, 95% CI = [0.042 0.423]), while no change was observed for the 1-Hz (F(1,42) = 0.879, p = 0.354, partial η2 = 0.021, 95% CI = [0.000 0.167]) or 0-Hz (F(1,42) = 0.168, p = 0.684, partial η2 = 0.004, 95% CI = [0.000 0.113]) condition. In the 10-Hz group, changes in N1 amplitude were significantly negatively correlated with changes in discrimination threshold (Pearson correlation, two-sided: r = −0.542, p = 0.037, 95% CI = [−0.723 −0.292]; Fig. 2f). By contrast, when the same peak-to-peak analysis was applied to the ipsilateral electrode, no significant effects were observed (Supplementary Fig. 2). These findings demonstrated that the 10-Hz RVS enhanced the task-associated N1 component, which may contribute to the observed behavioral performance improvement and suggests an involvement of LTP-like plasticity.
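A peak-to-peak N1 measure of the kind described above can be sketched as follows. The sketch assumes that the amplitude is the difference between the preceding positive peak and the N1 trough; the search windows (80–140 ms and 140–220 ms), the function name `n1_peak_to_peak`, and the synthetic waveform are hypothetical choices for illustration and are not the parameters used in the study (see its “EEG” Methods section for those).

```python
import numpy as np

def n1_peak_to_peak(erp, times_ms, p1_window=(80, 140), n1_window=(140, 220)):
    """Peak-to-peak N1 amplitude of a baseline-corrected ERP.

    Computed as the maximum in the (assumed) P1 window minus the
    minimum in the (assumed) N1 window; larger values mean a
    stronger negative deflection relative to the preceding peak.
    """
    erp = np.asarray(erp, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    p1_mask = (times_ms >= p1_window[0]) & (times_ms < p1_window[1])
    n1_mask = (times_ms >= n1_window[0]) & (times_ms <= n1_window[1])
    p1_peak = erp[p1_mask].max()    # most positive value before the N1
    n1_trough = erp[n1_mask].min()  # most negative value in the N1 range
    return p1_peak - n1_trough

# Synthetic 350-ms ERP sampled every 2 ms: a positive bump near 110 ms
# followed by a negative deflection near 160 ms (arbitrary units)
times = np.arange(0, 350, 2.0)
erp = (3.0 * np.exp(-((times - 110) / 15) ** 2)
       - 5.0 * np.exp(-((times - 160) / 20) ** 2))

amplitude = n1_peak_to_peak(erp, times)
```

Computing this value for each participant before and after RVS yields the pre/post amplitudes that enter a Session × Group analysis.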

10-Hz RVS shifts the excitatory-inhibitory balance in the visual cortex toward excitation

Both animal44,45 and human34,36 studies have documented the central roles of the excitatory neurotransmitter glutamate and the inhibitory neurotransmitter GABA in visual plasticity. A delicate balance of excitation and inhibition, known as the glutamate/GABA equilibrium, is required for cortical plasticity to take place46. To gain a deeper understanding of the neural underpinnings of the RVS effects, we used proton magnetic resonance spectroscopy (MRS) to measure the relative concentrations of glutamate and GABA during the training and the test.

In Experiment 5, 45 participants were evenly allocated to three groups: 0, 10, and 1 Hz (N = 15 each). Participants completed 30 min of the ODT training, followed by 25 min of RVS, a 4-h sleep-free break, and a 20-min test. While the procedure followed that of Experiments 1, 3, and 4, the visual stimuli were adapted to better fit the requirements of MRS scanning (see “Procedures and stimuli” in “Methods”). MRS was acquired using MEGA-PRESS47, and scanning spanned both the training and the test. The MRS voxel had a field of view (FOV) of 2 × 2.5 × 2.5 cm3, centered on a functionally defined region of interest (ROI) in visual cortex identified using fMRI responses to a peripheral checkerboard (see “fMRI and MRS” in “Methods”). The excitatory-inhibitory neurotransmitter balance, referred to as the E/I ratio, was quantified as the ratio of Glx (glutamate and glutamine) to GABA within each participant’s FOV (Fig. 3a, b).

Fig. 3: Experiment 5: 10-Hz RVS increases neural excitation in the visual cortex.

a Setup of the MRS (magnetic resonance spectroscopy) experiment. Participants viewed the screen via mirror reflection from inside the scanner. MRS data were acquired during the ODT (orientation discrimination task) and preprocessed to extract individual spectra. The depicted spectrum is from a representative participant. b FOVs (fields of view) employed during MRS acquisition. FOVs were placed based on the ROI (region of interest) localizer results to cover early visual cortical areas activated by ODT stimuli. Participants were randomly assigned to be trained in either the lower left or right visual field (VF). Heatmap color is the probability of a voxel being included in the FOVs. c Learning curves in Experiment 5. Error bars are SEMs. N (number of participants) = 45. d Simple main effects analyses from the repeated-measures ANOVA were conducted for each group to compare pre- and post-stimulation performance (two-sided, Bonferroni-adjusted: 10-Hz: F(1,42) = 11.753, p = 0.001, partial η2 = 0.219, 95% CI = [0.040 0.419]; 1-Hz: F(1,42) = 0.441, p = 0.510, partial η2 = 0.010, 95% CI = [0.000 0.140]; 0-Hz: F(1,42) = 0.728, p = 0.399, partial η2 = 0.017, 95% CI = [0.000 0.159]). Discrimination thresholds decreased following 10-Hz RVS (repetitive visual stimulation), with no significant change in the 1-Hz and 0-Hz groups. Filled circles and violins represent individual participants and the probability densities. Black horizontal lines indicate the means. N = 45. e Batch-wise changes in E/I ratios were analyzed using LME (linear mixed effect) models. Left panel shows individual data in all batches. The shaded area represents the interval between the training and test (RVS + rest). Upper right panel shows the data in the five batches in the training, averaged across all subjects (two-sided ANOVA: F(1,114) = 9.385, p = 0.003, partial η2 = 0.076, 95% CI = [0.010 0.183]). Lower right panel shows the group-averaged data in pre (batch #5 in the training) and post (batch #1 in the test) (simple main effect analysis of ANOVA, two-sided, Bonferroni-adjusted: 10-Hz: F(1,37.966) = 4.851, p = 0.034, partial η2 = 0.113, 95% CI = [0.000 0.318]; 1-Hz: F(1,37.966) = 0.365, p = 0.549, partial η2 = 0.010, 95% CI = [0.000 0.146]; 0-Hz: F(1,39.358) = 0.017, p = 0.897, partial η2 = 0.0004, 95% CI = [0.000 0.068]). Filled circles are individuals; error bars are SEMs. N = 42. n.s. non-significant, *p < 0.05, **p < 0.01. Source data are provided as a Source Data file.

Similar to Experiments 1 and 4, participants in Experiment 5 also showed a performance gain in the ODT after the 10-Hz RVS, despite slight stimulus differences. A significant Group × Session interaction was found (repeated-measures ANOVA: F(2,42) = 5.852, p = 0.006, partial η2 = 0.218, 95% CI = [0.048 0.580]). There was no main effect of either Group (F(2,42) = 0.173, p = 0.842, partial η2 = 0.008, 95% CI = [0.000 0.161]) or Session (F(1,42) = 1.217, p = 0.276, partial η2 = 0.028, 95% CI = [0.000 0.183]). Examination of simple main effects of Session showed significant improvement in the 10-Hz group (F(1,42) = 11.753, p = 0.001, partial η2 = 0.219, 95% CI = [0.040 0.419]), but not the 1-Hz (F(1,42) = 0.441, p = 0.510, partial η2 = 0.010, 95% CI = [0.000 0.140]) or the 0-Hz (F(1,42) = 0.728, p = 0.399, partial η2 = 0.017, 95% CI = [0.000 0.159]) group (Fig. 3c, d). One possible explanation for the absence of performance deterioration in the 1-Hz group lies in the experimental environment (noise and vibration in the scanner, nervousness, etc.), which may have already impaired performance in the pre-test and reduced the detectability of further performance deterioration in the post-test.

Analysis of MRS data obtained within the FOV revealed that training on the ODT, which demands fine discrimination, decreased the E/I ratio in the visual cortex. This is consistent with Frangou and colleagues’ finding48. Similar to their study, we constructed a linear mixed effects (LME) model, but additionally employed a sliding window (batch) approach and incorporated the batch-wise changes in the E/I ratio during training (normalized to the first batch) into the LME model. Our LME model (Group × Batch) showed that E/I levels in the visual cortex were comparable across groups prior to RVS (F(2,114) = 0.114, p = 0.893, partial η2 = 0.002, 95% CI = [0.000 0.051]). Interestingly, the model also revealed a significant decrease in the E/I ratio during training (F(1,114) = 9.385, p = 0.003, partial η2 = 0.076, 95% CI = [0.010 0.183]; Fig. 3e; see Supplementary Fig. 6 for separate quantifications of Glx and GABA). This suggests that training on fine orientation discrimination induced a decrease in visual cortical excitation, potentially supporting the training-induced behavioral improvement.
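The batch-wise, first-batch-normalized E/I ratio can be sketched as below. This is a simplified illustration under stated assumptions: it takes hypothetical per-block Glx and GABA estimates as input, whereas an actual MRS pipeline would typically average spectra within each window before quantifying the metabolites. The window of 6 blocks with a step of 1 block is inferred from the five training batches and the 6-block last batch mentioned in the text; the function name `batch_ei_ratio` and all values are made up.

```python
import numpy as np

def batch_ei_ratio(glx, gaba, window=6):
    """Sliding-window E/I ratio (Glx / GABA), normalized to the first batch.

    Each batch spans `window` consecutive blocks and batches advance
    by one block, so 10 blocks yield 5 batches.
    """
    glx = np.asarray(glx, dtype=float)
    gaba = np.asarray(gaba, dtype=float)
    n_batches = glx.size - window + 1
    ratios = np.array([
        glx[i:i + window].mean() / gaba[i:i + window].mean()
        for i in range(n_batches)
    ])
    return ratios / ratios[0]  # first batch is 1.0 by construction

# Hypothetical per-block estimates (arbitrary units) for 10 training blocks:
# Glx drifts downward while GABA stays constant
glx = np.linspace(10.0, 9.1, 10)
gaba = np.full(10, 2.0)

normalized = batch_ei_ratio(glx, gaba)
```

Values below 1.0 in later batches indicate a decrease in the E/I ratio relative to the start of training; such per-participant series could then be entered into a Group × Batch linear mixed-effects model.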

Critically, we found that the 10-Hz RVS shifted the E/I balance toward excitation when implemented immediately after the training. To assess this, we used a second LME model (Group × Session) to analyze the normalized E/I ratios of each group in each session. Sessions here were defined as pre (the last batch, i.e., last 6 blocks, in the training) and post (the batch, i.e., 6 blocks, in the test). Bonferroni-adjusted simple main effects showed a significant increase in the E/I ratio after the 10-Hz RVS (F(1,37.966) = 4.851, p = 0.034, partial η2 = 0.113, 95% CI = [0.000 0.318]) but not the 1-Hz (F(1,37.966) = 0.365, p = 0.549, partial η2 = 0.010, 95% CI = [0.000 0.146]) or 0-Hz (F(1,39.358) = 0.017, p = 0.897, partial η2 = 0.0004, 95% CI = [0.000 0.068]) RVS. However, Pearson correlation analysis showed no significant correlation between the change in E/I ratio and the change in discrimination threshold in the 10-Hz group (Pearson correlation, two-sided: r = −0.103, p = 0.726, 95% CI = [0.000 0.068]). These findings show that the increase in cortical excitation—as indexed by the E/I ratio—co-occurred with the behavioral improvement after the 10-Hz RVS, though the two effects were not strongly coupled at the individual level.

The results here suggest that the plasticity resulting from the training and the plasticity induced by the 10-Hz RVS might operate through different neurochemical mechanisms. Specifically, continuous training on feature discrimination was associated with decreased visual cortical excitation, as opposed to the plasticity induced by the 10-Hz RVS, which was associated with increased visual cortical excitation.

Discussion

In this study, we reported a photic stimulation method for modulating the offline processing of encoded sensory information (i.e., wakeful consolidation). In Experiment 1, we demonstrated that following visual training, 10-Hz RVS facilitated wakeful consolidation, whereas 1-Hz RVS impaired it. Experiment 2 revealed that such RVS effects were not due to RVS per se but depended on its application at the post-training stage, i.e., wakeful consolidation. In Experiment 3, we showed that the 10-Hz facilitation effect was not solely attributable to alpha oscillation entrainment and was dependent on the spatial content of visual input. Experiment 4 revealed an enhanced N1 component after the 10-Hz RVS, suggesting LTP-like plasticity in circuits supporting orientation discrimination. Finally, Experiment 5 showed an increase in the glutamate/GABA (E/I) ratio after the 10-Hz RVS, providing neurochemical evidence for excitation-related plasticity. Notably, this study provides evidence for RVS-induced changes in neurotransmitter concentrations in humans, addressing a previously unexplored aspect of RVS effects. The function of wakeful consolidation in visual learning has long been overlooked. Our findings call into question the common belief that the wakeful offline stage is merely idle49, showing instead that this period is susceptible to frequency-dependent modulation and suggesting that it actively engages consolidation-related processes and plays a functionally significant role in shaping learning outcomes. Our findings also help to uncover the neural mechanisms through which 10-Hz RVS modulates consolidation, contributing to a more comprehensive understanding of visual learning.

Our study demonstrated the potential of photic stimulation to modulate the wakeful consolidation of visual learning. Compared to conventional methods such as electrical stimulation25, magnetic stimulation14, and drug interventions50, photic stimulation offers several distinct advantages. It is practical, user-friendly, and highly versatile, which enhances its feasibility and acceptance in various research and applied contexts. In research contexts, photic stimulation can be meticulously tailored to selectively target specific neurons, circuits, and pathways in the visual system51, offering a valuable means to probe the intricate neural mechanisms underlying diverse aspects of visual functions. In applied contexts, given that visual learning has well-recognized clinical relevance in visual rehabilitation (e.g., amblyopia52), our findings raise the possibility that RVS could be leveraged to enhance the effectiveness of such rehabilitation protocols. However, it is also important to note that since photic stimulation depends largely on stimulus configuration, this specificity may limit its generalizability compared to transcranial electrical or magnetic stimulations, which often produce broader effects.

Whereas transcranial magnetic/electrical stimulation typically modulates relatively superficial cortical circuits near the stimulation site in a region-specific manner, RVS, as a structured visual input, may engage broader visual pathways, e.g., the thalamocortical loops. In Experiment 1, we found that the modulatory effects of RVS on consolidation were frequency-dependent. The facilitation effect of the 10-Hz RVS may involve two mechanisms: (1) entrainment of alpha oscillations in the visual pathway enhancing attention—a possibility later challenged by Experiment 3; and (2) the 10-Hz RVS may closely match the intrinsic resonance properties of thalamocortical circuits involved in visual processing53 and thereby induce plasticity54—akin to the role of beta rhythms in auditory processing55 and auditory plasticity56. Low-frequency stimulation (1 Hz), on the other hand, may induce cortical depression57, potentially accounting for the impaired post-test performance.

Experiment 2 showed that the effects of RVS were contingent upon prior training, as evidenced by the absence of impact on “naïve” participants. Marzoll and colleagues reported that 10-Hz RVS enhanced performance in untrained individuals but impaired performance in those who had already completed seven days of visual training on an orientation detection task58. While this may seem inconsistent with our findings, it underscores the context-dependent nature of RVS effects. Well-trained, newly trained, and naïve participants likely represent distinct learning stages: well-trained individuals with saturated performance and stabilized neural substrates59 may possess different neurochemical attributes in their visual system compared to newly trained participants in the current study. In contrast, cortices naïve to ODT training may lack the neurobiological milieu necessary for RVS-induced modulations. Moreover, task differences (e.g., discrimination versus detection) have been linked to distinct shifts in neurotransmitter concentrations60 and may further influence how RVS interacts with visual circuits. Collectively, these findings suggest that RVS does not exert a uniform influence on performance, but rather interacts with ongoing brain states shaped by task demands, training history, and local circuit properties. Future studies are needed to systematically disentangle these factors.

Experiment 3 demonstrated that the uniform-disk RVS failed to modulate the wakeful consolidation, yielding two important insights. First, the modulatory effects were dependent on the presence of visual orientation, implying that the effects might originate (at least partially) from orientation processing stages (e.g., early visual areas). Second, the 10-Hz facilitation effect could not be simply explained by potential attentional effects of alpha entrainment (see Supplementary Figs. 35), as the 10-Hz uniform-disk RVS, which could also entrain alpha oscillations in the visual cortex29, did not facilitate the wakeful consolidation.

Experiments 4 and 5 replicated in general the frequency-dependent effects of RVS observed in Experiment 1. However, the timing of their emergence during the post-test phase varied: in Experiments 1 and 4, group differences appeared immediately, whereas in Experiment 5, they unfolded more gradually. This discrepancy may be attributed to differences in rest duration, stimulus parameters, or alertness. Although speculative, these factors may shape how RVS effects are expressed over time and merit further investigation in future studies.

We further demonstrated that the facilitation of wakeful consolidation by the 10-Hz grating RVS manifested as enhanced ERPs and elevated visual cortical excitation. The enhanced N1 component aligns with previous findings in RVS research, despite differences in stimulus and task design33,35. N1 enhancement is widely regarded as a non-invasive marker of LTP-like plasticity in the human visual system33,37, and has been specifically linked to visual discrimination (as in our study) rather than visual detection61. Thus, the enhanced N1 may reflect LTP-like plasticity in neural circuits supporting orientation discrimination. By contrast, the 1-Hz RVS, which led to impaired behavioral performance, did not elicit a reliable ERP change. One possible explanation is that LTD-like plasticity may not be indexed by early ERP components. Several studies have similarly reported behavioral impairment following 1-Hz RVS, while evidence for corresponding ERP correlates is lacking, highlighting the challenge in identifying robust LTD-related neural markers in the human visual system38,39. Another possibility is measurement limitations: 1-Hz visual stimulation is less likely to entrain neural activity in a time-locked manner and is more susceptible to physiological and environmental noise62, rendering its effect elusive with the current measurement.

Facilitated wakeful consolidation was also associated with an elevated E/I ratio in the visual cortex. At the phenomenological level, studies using transcranial electrical/magnetic stimulation have shown that excitatory (e.g., a-tDCS) or inhibitory (e.g., low-frequency rTMS) protocols administered after training can respectively enhance or disrupt wakeful consolidation14,25, highlighting the relevance of cortical excitability to wakeful consolidation. Our findings suggest that the 10-Hz RVS affected wakeful consolidation, possibly via upscaled cortical excitation, reflecting reinstatement of task-related neural activities instead of inhibitory stabilization. Mechanistically, previous studies have shown that RVS-induced LTP-like plasticity not only depends on glutamatergic transmission in the visual cortex34,36,63, but also increases glutamate signaling45,64. The observed increase in cortical excitation may therefore reflect LTP-like plasticity. Meanwhile, it is also important to note that increasing cortical excitation does not necessarily benefit behavioral performance. For instance, a-tDCS administered during a Glass pattern feature discrimination task was found to yield no effect, while cathodal tDCS (c-tDCS) improved performance. In a detection task using similar stimuli, however, performance was improved by a-tDCS but unaffected by c-tDCS60. Stimulation outcomes likely depend on many factors, such as timing, task demands, and the specific neuroplastic mechanisms engaged.

Interestingly, contrary to the E/I ratio increase induced by the 10-Hz RVS, the E/I ratio decreased during the ODT training. There are two possible explanations for this phenomenon: (1) task repetition may lead to visual adaptation or overlearning, both associated with reduced cortical excitation18,65; (2) discrimination training may enhance inhibitory processing. Previous studies have shown that discrimination training sharpened feature representation, a process supported by parvalbumin-positive GABA interneurons66, and increased GABA concentration48. Regardless of the underlying cause, the decrease in E/I ratio induced by training may appear inconsistent with the increased E/I ratio induced by the 10-Hz RVS, though they had convergent behavioral effects. However, this apparent mismatch can be explained by the fact that stimulation effects are shaped by multiple factors, not merely by alignment with training-induced neurometabolic changes. For example, previous studies have shown that in orientation discrimination tasks, where a decrease in E/I ratio is typically expected, excitatory a-tDCS still led to improved behavioral performance67,68. Meanwhile, it is important to note that the SNR of 3.0 T MRS imposes constraints on the length of the sliding window. Our fixed window length (6 blocks) might have been too coarse to capture potential non-monotonic or abrupt E/I fluctuations during training, if they exist. Nevertheless, the MRS findings suggest that training-induced plasticity and RVS-induced LTP-like plasticity may represent distinct forms of macroscopic neural plasticity that are both behaviorally beneficial yet neurochemically divergent.

In summary, the current study introduces an approach for modulating wakeful consolidation and sheds light on its neural substrates. Future research should explore how RVS affects wakeful consolidation of various kinds of visual learning, from low-level feature detection and discrimination to high-level shape and object recognition. Furthermore, this repetitive stimulation paradigm can be naturally applied to other types of sensory learning (e.g., auditory learning). Together, these efforts will contribute to a fuller understanding of the neural mechanisms of repetitive sensory stimulation and wakeful consolidation.

Methods

Participants

A total of 270 participants met all screening criteria (see “Supplementary Methods” and Supplementary Fig. 1) and provided written informed consent forms prior to experiment commencement. Among these participants, 70 participated in Experiment 1 (mean age ± SD: 22.30 ± 3.12 years, 44 female), 70 in Experiment 2 (21.38 ± 2.47 years, 39 female), 40 in Experiment 3 (21.05 ± 2.30 years, 26 female), 45 in Experiment 4 (20.80 ± 2.15 years, 14 female), and 45 in Experiment 5 (20.67 ± 1.94 years, 24 female). Sex was self-reported during recruitment but was not considered in study design or analyses. No sex-based analyses were conducted because the study was not powered to detect sex differences, and no a priori hypotheses or theoretical framework pertained to sex as a moderating factor in the investigated mechanisms. However, data on sex are available in the Source Data file. All participants had normal or corrected-to-normal vision and no self-reported personal or family history of neurological or psychiatric disorders. All participants received monetary compensation for their involvement in the study. All experiments in this study were approved by the ethics committee of the School of Psychological and Cognitive Sciences, Peking University.

Procedures and stimuli

Experiments 2–5 were derived from Experiment 1. In Experiment 1, the objective was to examine whether RVS at different temporal frequencies could influence the wakeful consolidation of visual learning. Participants first provided informed consent and completed a screening questionnaire battery (see “Supplementary Methods”). Only participants who met all screening criteria proceeded to the main procedure, which consisted of 10 blocks of the ODT training, 5 blocks of RVS, a 1.5-h sleep-free break, and finally, 6 blocks of the ODT test. Each training and test block comprised 50 trials, with a mandatory rest period of at least 50 s between blocks. In every RVS block, we presented flickering stimuli at 10, 1, or 0 Hz for 5 min, and participants rested for at least 20 s between every two RVS blocks. This experiment employed a semi-double-blind design in which both participants and a second experimenter conducting the post-RVS test were unaware of the experimental condition (i.e., RVS frequency).

During the experiment, participants were seated in a dimly lit room and instructed to maintain their fixation on a central dot during the ODT and RVS. The viewing distance was approximately 70 cm. Eye movements were monitored using an eye tracker (Eyelink 1000 Plus, SR Research, Ontario, Canada). During the break between RVS and the test, participants were instructed to refrain from sleeping or engaging in physical or mental exertion and were required to wear a wrist-mounted smart band (Honor Band 6, Honor Device, Shenzhen, China) to track their physical activity during the break.

Experiments 2–5 were conducted under a single-blind design. In Experiment 2, the training was omitted. In Experiment 3, grating stimuli in RVS were changed into round uniform disks, and only the 10-Hz and the 1-Hz groups were included. In Experiment 4, the rest between RVS and the test was prolonged to 4 h, and EEG was recorded throughout the training, RVS, and the test. In Experiment 5, the rest between RVS and the test was also set to 4 h. We acquired structural and functional MRI data for the first time before the ODT training and a second time before the ODT test. MRS data were acquired during the training and test. The viewing distance was increased to approximately 120 cm.

Visual stimuli used in the ODT, RVS, and fMRI ROI localizer in all five experiments were generated using Matlab and Psychtoolbox-3 (ref. 69).

Orientation discrimination task

The visual stimuli were presented on either a Display++ monitor (Cambridge Research Systems Ltd., Kent, UK; used in Experiments 1–4) or an MR-compatible BOLDscreen monitor (same manufacturer; used in Experiment 5). Both screens had a refresh rate of 120 Hz and a resolution of 1920 × 1080 pixels, and were gamma-corrected to a linear luminance range of 0 to 100 cd·m⁻² before experiment commencement. A central fixation dot with a diameter of 0.17° was constantly visible against a gray background of half the maximum luminance of the display (50 cd·m⁻²). ODT stimuli were either Gaussian-enveloped sinusoidal gabors (Experiments 1, 2, 3, and 5) or non-enveloped sinusoidal gratings (Experiment 4), with orientations around either 26° or 154°, depending on the visual field where stimuli were presented. The stimulus parameters are detailed in Table 1.

Each trial began with the first stimulus being displayed for 100 ms, followed by a blank interval of 600 ms (jittered between 500 and 700 ms in Experiment 4), and then a second 100-ms stimulus with a slightly altered orientation at the exact location of the first stimulus. Participants indicated the direction of the rotation (clockwise or counterclockwise) with a keypress and were explicitly instructed not to respond hastily. After the response was given, there was a lapse between trials, which lasted 500 ms in Experiments 1, 2, 3, and 5 but varied between 500 and 700 ms in Experiment 4 (i.e., the EEG experiment). The varying durations of trial-to-trial lapse, together with the jittered between-stimulus intervals, were implemented to avoid EEG artifacts arising from the relatively close arrangement of stimuli and trials.

Each ODT block consisted of 50 trials, with the difficulty level (i.e., orientation difference) controlled via a QUEST staircase procedure to estimate participants’ discrimination threshold at 75% accuracy. No feedback was provided during the ODT training or test. However, during the practice round preceding training, participants received visual feedback—the central fixation dot turned green after each correct response—in order to facilitate practice. The practice round consisted of 25 trials and employed orientations (116° or 64°) different from those in the ODT training and test (26° or 154°).

Repetitive visual stimulation

RVS was administered immediately following the completion of the ODT training. The sinusoidal gratings used in RVS were noise-free and slightly larger in size than the gabors/gratings in the corresponding ODT, ensuring full coverage of the trained visual field. High similarity between the stimuli adopted in the ODT and in RVS was ensured as sensory LTP-like plasticity is known to be input-specific70. That is, gratings in RVS had the same spatial frequency and orientation as those in the ODT. Detailed stimulus parameters can be found in Table 1.

Each RVS block lasted 5 min, and we investigated the effects of RVS at different frequencies. In the 10-Hz RVS group, gratings were presented in 1-s flickering bursts (10 Hz phase-reversing), following every 5-s blank interval. Within each flickering second, a grating at phase φ was visible for 16.67 ms, followed by an 83.33-ms blank period. This was then followed by a second grating of phase φ + 180° for 16.67 ms and another 83.33-ms blank period. This sequence was repeated five times, hence 1 s.
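The burst timing described above can be checked against the 120-Hz refresh rate of the displays. The following is a minimal sketch, not the presentation code used in the study (which was written in Matlab with Psychtoolbox-3). The function names `ms_to_frames` and `burst_schedule` are hypothetical, and rounding each duration to a whole number of frames is an assumption.

```python
REFRESH_HZ = 120  # monitor refresh rate reported in the Methods


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Convert a duration in ms to a whole number of video frames (assumed rounding)."""
    return round(duration_ms * refresh_hz / 1000)


def burst_schedule(n_repeats=5, on_ms=16.67, off_ms=83.33):
    """Return (label, n_frames) pairs for one 1-s burst of the 10-Hz RVS."""
    schedule = []
    for _ in range(n_repeats):
        # Each repeat shows the grating at phase phi, then at phi + 180 deg,
        # with a blank period after each presentation.
        for label in ("phase", "phase+180"):
            schedule.append((label, ms_to_frames(on_ms)))
            schedule.append(("blank", ms_to_frames(off_ms)))
    return schedule


schedule = burst_schedule()
total_frames = sum(n for _, n in schedule)
n_gratings = sum(1 for label, _ in schedule if label != "blank")
print(total_frames, n_gratings)
```

Under these assumptions, each grating occupies 2 frames and each blank 10 frames, so one burst contains 10 grating onsets in 120 frames, i.e., 10 Hz over 1 s.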

In the 1-Hz RVS group, gratings continuously flickered at 1 Hz with phase reversal throughout each 5-min block. Specifically, each cycle consisted of a grating at phase φ shown for 16.67 ms, followed by a 983.33-ms blank, then a second grating at phase φ + 180° for 16.67 ms, followed by another 983.33-ms blank. The design of the 10- and 1-Hz RVS is similar to that described by Marzoll and colleagues58.

In the 0-Hz RVS (control) group, 15 static gratings were presented over the 25-min stimulation period, each displayed for 1.67 s. The intervals between every two gratings were randomized to eliminate frequency-specific temporal structure while maintaining a comparable amount of overall visual stimulation with the other two groups.
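For reference, the cumulative grating-on time implied by the three protocols can be tallied from the reported parameters. This is an illustrative calculation only. It assumes that each 5-min block of the 10-Hz protocol is filled by whole 6-s cycles (5-s blank plus 1-s burst), that the 1-Hz protocol delivers one grating onset per second, and that each brief grating lasts exactly 2 frames at 120 Hz. All names are hypothetical.

```python
BLOCK_S = 5 * 60   # duration of one RVS block in seconds
N_BLOCKS = 5       # number of RVS blocks
FLASH_S = 2 / 120  # 16.67 ms, i.e., 2 frames at 120 Hz (assumption)


def flashes_10hz_per_block(blank_s=5, burst_s=1, flashes_per_burst=10):
    """Grating onsets per block, assuming whole blank-plus-burst cycles."""
    cycles = BLOCK_S // (blank_s + burst_s)
    return cycles * flashes_per_burst


def flashes_1hz_per_block():
    """Grating onsets per block for continuous 1-Hz flicker."""
    return BLOCK_S * 1


def total_on_time_s():
    """Cumulative grating-on time (s) across the whole stimulation period."""
    return {
        "10 Hz": N_BLOCKS * flashes_10hz_per_block() * FLASH_S,
        "1 Hz": N_BLOCKS * flashes_1hz_per_block() * FLASH_S,
        "0 Hz": 15 * 1.67,  # 15 static gratings of 1.67 s each
    }


for group, seconds in total_on_time_s().items():
    print(f"{group}: {seconds:.2f} s")
```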

To ensure participants’ attention to the visual stimuli during RVS, we employed an oddball paradigm, in which participants were asked to detect occasional reductions in grating size. The diameters of the oddball stimuli ranged from 50% to 75% of the standard grating diameter. Their spatial frequency was scaled proportionally. After each RVS block, participants reported the number of smaller gratings they had detected in the block using a numeric keyboard. Participants who answered incorrectly two times out of five were excluded from further analysis.

In Experiment 3 (i.e., the “uniform-disk” experiment), gratings at phase φ and φ + 180° were replaced with white and black uniform-luminance disks of the same diameter.

fMRI localizer

The ROI localizer stimulus was presented on the BOLDscreen monitor, as mentioned earlier, and viewed from inside the MR scanner. The purpose of the localizer was to identify the early visual cortical regions responsive to the gabors in the ODT. To achieve this, a black-and-white checkerboard pattern (grid size: 0.44°) was reversed at a variable frequency of 3–5 Hz71. The choice of a variable reversal frequency allowed us to avoid potential interference from steady-state 4-Hz visual stimulation, which might otherwise overlap with our experimental frequencies of interest: 10, 1, and 0 Hz. The checkerboard subtended a circular area of 3° in diameter, centered 5° from the central fixation point. This location matched precisely the gabors in the ODT in Experiment 5. Each localizer cycle consisted of a 15-s checkerboard reversal followed by a 15-s blank screen, repeated 8 times for a total duration of 4 min. Participants were instructed to maintain central fixation throughout the localizer scan.

Data acquisition and analysis

Behavioral measurements

For the ODT, the geometric mean of the threshold values obtained from the last 6 blocks in the training was termed “pre” and was compared to the geometric mean of the threshold values from the 6 blocks in the test, which was termed “post.” Participants whose pre or post performance deviated by three or more standard deviations from the group mean were excluded from further statistical analyses. We employed a 2 × 3 repeated-measures ANOVA with Session (pre, post) as a within-subject factor and Group (0, 10, and 1 Hz) as a between-subject factor to analyze the ODT thresholds in Experiments 1, 4, and 5. In Experiment 2 (the “naïve” experiment), a one-way ANOVA (between-subject factor: Group) was used. In Experiment 3 (the “uniform-disk” experiment), a 2 × 2 repeated-measures ANOVA was used to compare pre and post performance between the 10- and 1-Hz groups.
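The pre/post summary and the exclusion rule can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) across all participants is an assumption, as the text does not specify it.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of positive threshold values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pre_post(training_thresholds, test_thresholds, n_last=6):
    """Pre = last 6 training blocks; post = all test blocks (6 in the study)."""
    pre = geometric_mean(training_thresholds[-n_last:])
    post = geometric_mean(test_thresholds)
    return pre, post


def outlier_flags(scores, n_sd=3.0):
    """Flag scores deviating by n_sd or more sample SDs from the group mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD; an assumption
    return [abs(s - mean) >= n_sd * sd for s in scores]


# Made-up thresholds (deg) for one hypothetical participant.
training = [4.0, 3.6, 3.1, 2.9, 2.8, 2.6, 2.5, 2.5, 2.4, 2.4]
test = [2.2, 2.1, 2.2, 2.0, 2.1, 2.0]
print(pre_post(training, test))
```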

EEG

Experiment 4 (the EEG experiment) was conducted in a magnetically shielded room, and EEG data were collected using a 64-channel Ag/AgCl cap (Easycap, BrainProducts, Gilching, Germany) and the BrainVision Recorder system (BrainProducts, Gilching, Germany). Signals were continuously recorded at 500 Hz with an FCz reference and preprocessed using EEGLAB (version 2021.1). EEG signals were re-referenced to bilateral mastoids and bandpass filtered (1–12 Hz)72. Electrodes P5 and P6 were selected for epoching based on their maximal N1 amplitude over the contralateral hemisphere relative to stimulus presentation. Data from these two electrodes were then segmented into baseline-corrected epochs with a 50-ms pre-stimulus baseline and a 300-ms period following stimulus onset. To avoid neural response overlap between stimuli, only epochs corresponding to the first grating in each trial were included, as the ERPs induced by the second grating occurred before the response to the first grating had returned to baseline. Epochs with significant eye movements were removed (see “Supplementary Methods”). For each participant, ERPs were then averaged across the last two training blocks (pre) and the first two test blocks (post), separately for the electrode (either P5 or P6) contralateral (CON) or ipsilateral (IPS) to the visual field of stimulus presentation.

The contralateral hemisphere exhibited the N1 component before the ipsilateral hemisphere. In addition, decreasing stimulus size and increasing eccentricity from central fixation are both known to prolong ERP component latencies43. Accordingly, N1 was defined as the negative peak occurring between 138 and 192 ms for the CON electrode and between 164 and 224 ms for the IPS electrode. The peak-to-peak amplitude of N1 at the CON electrode was extracted and analyzed using a 2 × 3 repeated-measures ANOVA (within-subject: Session; between-subject: Group).
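Locating the negative peak within a latency window can be sketched as below. This is a simplified illustration, not the EEGLAB pipeline used in the study. The ERP is synthetic, the function name `negative_peak` is hypothetical, and the sketch returns only the latency and amplitude of the most negative sample; the peak-to-peak measure reported above requires a reference peak whose definition is not given here and is therefore not computed.

```python
import math

SAMPLE_STEP_MS = 2  # 500-Hz sampling rate -> one sample every 2 ms


def negative_peak(erp, times_ms, window_ms):
    """Return (latency_ms, amplitude) of the most negative sample in the window."""
    lo, hi = window_ms
    candidates = [(v, t) for v, t in zip(erp, times_ms) if lo <= t <= hi]
    amplitude, latency = min(candidates)
    return latency, amplitude


# Epoch from -50 to 300 ms, matching the epoch limits reported in the Methods.
times = list(range(-50, 301, SAMPLE_STEP_MS))
# Synthetic ERP (arbitrary units): a negative deflection centered at 160 ms.
erp = [-3.0 * math.exp(-((t - 160) / 20) ** 2) for t in times]

CON_WINDOW = (138, 192)  # N1 window for the contralateral electrode
IPS_WINDOW = (164, 224)  # N1 window for the ipsilateral electrode
print(negative_peak(erp, times, CON_WINDOW))
```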

fMRI and MRS

Structural, functional, and spectroscopic data were acquired using a 3-Tesla Prisma MRI scanner equipped with a Syngo MR XA30 system (Siemens, Forchheim, Germany) and a 64-channel head-neck coil. High-resolution T1-weighted anatomical images were obtained using the MPRAGE sequence with GRAPPA2 (TR = 2300 ms, TE = 2.98 ms, FOV = 232 × 256 mm², voxel size = 1.0 × 1.0 × 1.0 mm³, 192 slices, acquisition time = 5:03 min). Functional BOLD images were acquired using an echo-planar imaging sequence (TR = 1500 ms, TE = 30 ms, FOV = 192 × 192 mm², matrix = 96 × 96, thickness = 2 mm, 60 slices, acquisition time = 4:09 min).

BOLD signals were acquired during the ROI localizer using a block design. Two conditions (“on” and “off”; see “fMRI localizer” in “Methods”) were alternated in 15-s blocks to allow the BOLD signal to return to baseline each time after the response to the checkerboard was elicited. Functional volumes were aligned to the first image to correct for possible head motion and registered on individual anatomical scans. We used a two-category (on and off) general linear model (GLM) and a paired t-test in SPM12 (ref. 73) to identify the activated areas during the localizer. ROIs were defined as clusters of activation along the calcarine fissure in the hemisphere contralateral to the localizer stimulus and were subsequently used to guide FOV placement in MRS.
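The on/off structure of the localizer can be expressed as a boxcar sampled at the functional TR. The sketch below is illustrative only. It omits the convolution with a hemodynamic response function that a GLM package such as SPM12 applies, it assumes the run begins with an "on" block and contains no extra volumes, and the function name `boxcar` is hypothetical.

```python
def boxcar(n_cycles=8, on_s=15.0, off_s=15.0, tr_s=1.5):
    """Return a 0/1 boxcar with one entry per functional volume."""
    on_vols = round(on_s / tr_s)    # volumes acquired during the checkerboard
    off_vols = round(off_s / tr_s)  # volumes acquired during the blank screen
    return ([1] * on_vols + [0] * off_vols) * n_cycles


regressor = boxcar()
print(len(regressor), sum(regressor))
```

With the reported parameters (15-s blocks, TR = 1500 ms, 8 cycles), this yields 160 volumes, half of them in the "on" condition.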

MRS was acquired using the MEGA-PRESS sequence47 provided by the University of Minnesota under a C2P agreement (TR = 2000 ms, TE = 68 ms, FOV = 20 × 25 × 25 mm³, acquisition time was initialized as 1024 averages but varied according to participants’ task performance speed). B0 shimming was performed automatically using the vendor-provided shim module and manually adjusted when necessary to ensure that the full width at half maximum (FWHM) of the water peak was below 15 Hz for all participants and sessions. Water suppression during MEGA-PRESS was achieved using the VAPOR (variable power RF pulses with optimized relaxation delays) technique. Frequency-selective editing pulses were applied at 1.90 (“ON”) and 9.00 (“OFF”) parts per million (ppm) to edit GABA resonance at 3 ppm and remove unmodulated background signals in the difference spectrum (ON–OFF). Finally, at the end of both the training and the test, two VAPOR-off water scans were acquired for water quantification.

Spectrum processing and quantification were performed using a customized version of the Gannet 3.2.1 toolbox74 in Matlab 2023a. Only MRS scans fully contained within ODT blocks were retained; scans partially or entirely outside these blocks were excluded. The remaining scans were then reorganized into blocks so that each ODT block corresponded to one set of MRS scans. We then adopted a sliding window (referred to as “batch”) with a length of 6 blocks (~12 min) and a step size of 1 block. The average spectrum was computed for each batch, yielding five spectra in the training and one spectrum in the test for each participant, which were then normalized against the first batch in training.
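The batching scheme can be illustrated with a short sketch. It is not the customized Gannet code used in the study. The function names are hypothetical, and spectra are represented as plain lists of numbers for simplicity.

```python
def sliding_batches(n_blocks, window=6, step=1):
    """Return the block indices (0-based) that make up each batch."""
    return [
        list(range(start, start + window))
        for start in range(0, n_blocks - window + 1, step)
    ]


def average_spectrum(spectra):
    """Point-wise mean of equally long spectra (one list per block)."""
    n = len(spectra)
    return [sum(points) / n for points in zip(*spectra)]


training_batches = sliding_batches(10)  # 10 training blocks
test_batches = sliding_batches(6)       # 6 test blocks
print(len(training_batches), len(test_batches))
```

With a 6-block window and a 1-block step, the 10 training blocks give five batches (blocks 1–6 through 5–10) and the 6 test blocks give a single batch, matching the five training spectra and one test spectrum described above.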

The concentrations of Glx and GABA were obtained by fitting difference-edited (subtracted) spectra at their respective chemical shift positions. Data points with fitting errors exceeding 10% were excluded from further analyses, leaving 14 participants in each group. The E/I ratio was calculated as the ratio of Glx to GABA. Percentage changes in the E/I ratio relative to the first batch were analyzed using LME models. The first LME model tested whether the E/I ratio decreased with training and was specified as Change in E/I = Batch (1 to 5, as a covariate) × Group (0, 10, and 1 Hz), with intercepts included. A second LME model examined the effects of both Session and Group and was specified as Change in E/I = Session (pre and post) × Group (0, 10, and 1 Hz), with intercepts included.
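The dependent variable of the LME models can be sketched as follows. This is an illustration only. The concentration values are made up, the function names are hypothetical, and the percentage-change formula (difference from the first batch divided by the first batch) is the conventional definition, assumed here because the text does not spell it out.

```python
def ei_ratio(glx, gaba):
    """E/I ratio defined as Glx concentration divided by GABA concentration."""
    return glx / gaba


def percent_change(ratios):
    """Percentage change of each batch relative to the first batch."""
    baseline = ratios[0]
    return [100.0 * (r - baseline) / baseline for r in ratios]


# Made-up concentrations (arbitrary units) for five training batches.
glx = [10.0, 9.8, 9.5, 9.3, 9.0]
gaba = [2.0, 2.0, 2.0, 2.0, 2.0]

ratios = [ei_ratio(g, i) for g, i in zip(glx, gaba)]
print([round(c, 1) for c in percent_change(ratios)])
```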

FOV geometries were transformed into the standard Montreal Neurological Institute (MNI) coordinate system using individual structural data and registered onto an MNI152 standard brain image. With the acquired MR data, we calculated the average distance of individual FOVs from the mean MNI coordinates. FOVs were then visualized using FreeSurfer75. For participants trained in the lower left quadrant, FOV was in the right hemisphere (mean ± SEM: x = 16.96 ± 0.24 mm, y = −81.17 ± 0.56 mm, z = 8.00 ± 0.61 mm). For those trained in the lower right quadrant, FOV was in the left hemisphere (mean ± SEM: x = −13.39 ± 0.28 mm, y = −83.77 ± 0.43 mm, z = 6.55 ± 1.02 mm)76. The small SEMs indicate that individual differences in FOV localization were small.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.