Introduction

Sleep is thought to provide the necessary conditions for memory consolidation1, facilitating the transfer of memory traces from temporary storage in the hippocampus to permanent storage in neocortical regions2. Sleep’s role in memory is supported by research ranging from place cell recordings in rats to fMRI studies in humans (see3,4 for reviews). Two classes of neural oscillations that take place in non-rapid eye movement (NREM) sleep, specifically stages N2 and N3, have been particularly implicated in memory processes: slow oscillations (SOs; 0.5–1.5 Hz delta waves) and sleep spindles (11–16 Hz sigma waves). These EEG features correlate with behavioural performance in both humans5 and animals6,7.

According to the active systems consolidation theory8, the temporal synchrony between SOs and spindles is critical for effective memory consolidation9. Sleep spindles tend to align preferentially with SO up-states10, and better alignment predicts better consolidation. For example, Niknazar et al.11 demonstrated that enhanced SO-spindle coupling improved verbal recall compared to weaker coupling. Multiple studies across different age groups have since replicated this correlation between the synchrony of SO-spindle coupling and behavioural performance12,13,14, emphasizing the functional significance of this interaction in memory processes. However, not all models focus on coupling. An alternative perspective emphasizes the frequency and pattern of occurrence of spindles as being critical, particularly for procedural memory consolidation15,16. In either case, a large number of spindles and slow oscillations occur that are neither part of coupled complexes nor trains, raising questions about whether they still might have functional roles17. In sum, there are still many gaps in our understanding of the precise roles of sleep-specific neural oscillations in memory consolidation, and how they interact with different memory systems.

Given the various roles proposed for sleep oscillations, it is important to consider how they might differentially affect different types of memory. Declarative memory involves the conscious recall of information, such as lists of words, whereas procedural memory encompasses skills that can be performed without conscious awareness, once learned18,19,20. The demands of everyday life generally require the simultaneous or integrated use of various forms of memory, which we term ‘complex learning’. Ideally, the outcome of research on sleep and memory would generalize to these complex human learning experiences. However, to study memory consolidation in a laboratory setting, researchers have refined tasks to distinguish clearly between declarative and procedural memory functions – an approach that can help us to compare how different types of information are stored. Correlational studies trying to link sleep brain activity to behavioural performance suggest distinct roles in memory processing for slow oscillations and sleep spindles, with SOs preferentially influencing declarative memory21,22, and spindles impacting procedural memory23,24. However, these associations are not absolute, as there are exceptions: some studies have shown sleep spindles to be important in declarative memory25 and slow oscillations to be relevant to procedural memory26. Although each type of task involves some distinct brain regions, the brain systems involved play similar functional roles across domains, which depend on common anatomical, physiological, and biochemical substrates – suggesting they might be less separable than assumed by the declarative/procedural memory model27,28. A recent comprehensive review acknowledges that the role of sleep in human memory consolidation is still under considerable debate, and numerous contradictory and non-replicable findings have been reported29. 
Additionally, challenges to the notion of ‘sleep-dependent memory consolidation’ suggest that plasticity mechanisms critical for consolidation operate during both sleep and waking states, positioning sleep along a continuum of behavioural states rather than as uniquely necessary for memory formation30. These perspectives may help explain why sleep-memory benefits are not consistently observed across studies.

Despite these inconsistent findings, there is evidence that tasks considered to selectively target declarative and procedural memory systems both benefit from sleep21,31, and that consolidation of both subtypes of memory seems to follow a similar process, as sensory information captured during learning transitions from short-term memory into more stable long-term memory formats32. The specific, and potentially different, roles of each sleep oscillation (either separately or temporally synchronized) therefore remain unclear. Developing tools to experimentally manipulate them is necessary to clarify the mechanisms underlying memory consolidation.

To causally investigate these mechanisms, researchers have developed techniques to experimentally manipulate sleep oscillations. In 2013, Ngo et al. demonstrated that slow oscillations can be modulated by targeting their up-states using a technique known as Closed-Loop Auditory Stimulation (CLAS), which resulted in evoked slow wave and spindle band activity, and an improvement in overnight memory consolidation on a declarative memory task33. This approach (SO-CLAS) has since been used extensively (for recent reviews, see34,35). Although all of these studies observed enhanced slow wave activity following CLAS, only a subset supported a causal link between SO activity and memory consolidation by enhancing performance on declarative memory tasks33,36,37,38 (see Supplementary Fig. 1 for a summary of CLAS study designs and findings). The discrepancies in behavioural outcomes in both declarative39,40,41,42,43 and procedural paradigms44 suggest that the causal relationship between neurophysiological and behavioural responses to CLAS is not straightforward.

SO-CLAS research efforts have primarily focused on the consolidation of simple tasks (e.g., word-pair memorization, simple arithmetic, serial motor reaction time, and motor sequence tasks23,37,45). Research on the benefit of CLAS for complex tasks has been limited, but suggests potential influences. Shimizu et al.46 found that SO-CLAS improves performance on a navigation task, showing the feasibility of real-world applications.

A similar CLAS approach on sleep spindles (SP-CLAS) could provide complementary information about their roles. However, CLAS directly targeting spindles has been technically challenging due to their short duration and high inter- and intra-subject variability, with previous closed-loop attempts achieving only 20–23% successful targeting23,47,48. Through interdisciplinary collaboration, our lab has developed a device capable of real-time detection and stimulation of sleep spindles, the ‘Portiloop’49,50. The capacity of SP-CLAS to modulate neural activity has been demonstrated in a multi-night within-subject design51. Additionally, recent research has challenged the previously established hypothesis regarding the sensory-blocking role of sleep spindles, showing that auditory information continues to reach the cortex even during spindles and their refractory periods52. Targeted manipulation of both SO and SP through CLAS, while measuring their effects on procedural, declarative, and complex learning, may elucidate their specific roles in memory consolidation. The Portiloop is capable of achieving high-precision real-time detection and stimulation of sleep spindles49, with 97.6% successful targeting before spindle offset51, enabling the systematic investigation of timing-dependent effects within individual spindles on memory consolidation. Comparing the effects of closed-loop auditory stimulation of both SO and SP in a single study paradigm may therefore provide information about their contribution to memory consolidation.

In the present study, we investigated the neurophysiological and behavioural outcomes of SO-CLAS and SP-CLAS (using two approaches: immediate and delayed stimulation of spindles) on different memory systems in a between-subjects nap design (see Fig. 1). We included both immediate and delayed (450 ms post-detection) spindle stimulation conditions because previous attempts at spindle-targeted CLAS predominantly stimulated spindles at their tail ends or after offset47, and recent work demonstrated that stimulation timing within the spindle (first versus second half) produces different neurophysiological effects51. The present study includes both confirmatory hypotheses (replicating established physiological effects and testing predicted memory system differences) and exploratory analyses (examining individual differences in stimulation effectiveness and effects on complex learning). Subjects were randomly assigned to one of the experimental conditions (including unstimulated sleep and wake control) for a 2 hr nap opportunity. Evoked brain activity and change in performance on declarative, procedural, and complex tasks were compared across conditions. We present findings demonstrating successful neurophysiological modulation alongside complex patterns of behavioural change across memory systems.

Fig. 1: Study design.

a Participants learned and were tested on three behavioural tasks (counterbalanced order) before being randomly assigned to a manipulation condition (N = ~20 per condition). After a 2-hour wake or sleep period, they were tested on the tasks again and changes in performance were computed. GLT: Grid Location Task, MSL: Motor Sequence Learning task, Piano: complex learning task. b Schematic representation of the electrode placements and site of detection for each condition. In the slow oscillation stimulation condition (SO), brain oscillations were detected at Fpz, while in the spindle stimulation and delayed spindle stimulation conditions (SP, SPd), brain oscillations were detected at Cz. LM: left mastoid (reference). c In the SO condition, the auditory stimulation (15 ms pink noise) was sent at the SO up-state. d In the SP condition, the auditory stimulation (15 ms pink noise) was sent at spindle detection, while in the SPd condition the stimulation was sent 450 ms after detection.

Results

Homogeneity across groups

We first conducted several statistical tests to ensure homogeneity of demographics and alertness across the groups before experimental manipulation. A one-way ANOVA revealed no significant differences in participants’ age across groups (F(4, 97) = 0.96, p = 0.44). Sex distribution was similarly balanced, as confirmed by a Chi-square test of independence (\(\chi^{2}(8) = 6.23\), p = 0.61). Participant alertness, as measured by performance on the Psychomotor Vigilance Task (PVT) before the learning phase, showed no significant differences between the five conditions, as indicated by a one-way ANOVA (F(4, 97) = 1.03, p = 0.39). Regarding sleep parameters, across all sleep conditions, participants averaged 71.4 minutes of sleep (SD: 21.8; see Supplementary Table 1 for complete results). To assess whether auditory stimulation affected sleep quality, we compared sleep duration (N2 and N3 combined) in each stimulation condition to the unstimulated Sleep condition using independent-samples Student’s t-tests. No significant differences emerged between any of the stimulated conditions and the control Sleep condition (SO vs Sleep: t(38) = −0.69, p = 0.49, Cohen’s d = −0.22; SP vs Sleep: t(40) = 0.73, p = 0.47, Cohen’s d = 0.23; SPd vs Sleep: t(38) = −1.82, p = 0.08, Cohen’s d = −0.58). To ensure groups did not differ in baseline task performance, we conducted one-way ANOVAs comparing pre-sleep performance across conditions. No significant differences were found for GLT accuracy (F(4, 97) = 0.05, p = 1.00, ω2 = 0.00), MSL accuracy (F(4, 47.30) = 0.81, p = 0.53, ω2 = 0.00), Piano pitch accuracy (F(4, 96) = 1.71, p = 0.15, ω2 = 0.00), or Piano rhythm accuracy (F(4, 46.32) = 0.17, p = 0.95, ω2 = 0.00), confirming comparable baseline performance across groups. To assess potential sleep inertia effects, we compared post-sleep PVT reaction times across conditions following the nap opportunity using a one-way ANOVA. No significant differences were found between conditions (F(4, 97) = 0.73, p = 0.57, ω2 = 0.00), indicating that sleep inertia did not differentially affect alertness across groups. In the final Wake group (N = 20), 18 participants had no N2 or N3 sleep, while 2 participants had brief episodes of deeper sleep (1.5 and 3.5 minutes of combined N2 and N3, respectively). These participants were retained, as the minimal exposure to oscillation-containing sleep stages was unlikely to affect our experimental contrasts.
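The reported effect sizes can be cross-checked against the t statistics. For two independent groups compared with a pooled-variance t-test, Cohen's d equals t multiplied by sqrt(1/n1 + 1/n2). The sketch below applies this identity. The group sizes are assumptions: the text gives only the degrees of freedom (n1 + n2 − 2) and roughly 20 participants per condition, and the function name is illustrative.

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Cohen's d for two independent groups, from a pooled-variance t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# (reported t, assumed n1, assumed n2); sizes are chosen only to match the reported df.
comparisons = {
    "SO vs Sleep": (-0.69, 20, 20),   # df = 38
    "SP vs Sleep": (0.73, 21, 21),    # df = 40
    "SPd vs Sleep": (-1.82, 20, 20),  # df = 38
}

d_values = {
    name: round(cohens_d_from_t(t, n1, n2), 2)
    for name, (t, n1, n2) in comparisons.items()
}
print(d_values)
```

Under these assumed group sizes, the sketch yields −0.22, 0.23 and −0.58, in agreement with the Cohen's d values reported above.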

Electrophysiological effects of brain stimulation

To document the electrophysiological effects of auditory stimulation, we compared the evoked responses in each condition in the two frequency bands of interest (i.e., 0.1–4 Hz for slow wave activity and 11–16 Hz for spindle activity). A clear evoked slow oscillation was observed when sound was delivered during SO up-states, with statistically significant differences from sham (Sleep condition) between 0.5 and 1.5 seconds post stimulation (Fig. 2a, top), replicating the results described in refs. 33,36,37,40. Similarly, a clear evoked slow response was observed when sound was sent simultaneously with sleep spindle detection (SP condition) or 450 ms later (SPd; Fig. 2a, bottom). Note that the differences in waveform between the SO and the two SP stimulation conditions are due to the presence of an SO in the former case (and to differences in the recording equipment and montage, see Methods); each waveform is therefore compared with the equivalent detection in the sham condition (Sleep). In both conditions the stimulation induced statistically significant changes in amplitude for the majority of the duration between stimulation and 2 s.

Fig. 2: Neurophysiological effects of closed-loop auditory stimulation.

CLAS of both slow oscillations (detected at Fpz) and sleep spindles (detected at Cz) enhances (a) slow wave activity (0.1–4 Hz) and (b) spindle band activity (11–16 Hz). Black vertical lines represent stimulation onset. Dashed vertical lines represent the onset of spindle detection in the delayed stimulation condition (SPd). Solid lines indicate the group mean and shaded lines represent the standard error of the mean. Grey shaded rectangles delineate time windows used to extract the magnitude of evoked activity in each frequency band. Statistical differences comparing stimulated to unstimulated sleep (i.e., Sleep condition) are represented in the bottom panels. Solid areas represent corrected p-values and coloured shading represents uncorrected p-values.

Next, we evaluated the effects of each stimulation type on spindle band activity (11–16 Hz). Concerning the SO condition (i.e., auditory stimulation coinciding with the SO up-state), we were also able to replicate the findings described in earlier work of increased spindle activity between 750 ms and 1.5 s post stimulation38,41,43,44,53,54; see Fig. 2b, top. Additionally, a transient decrease in spindle band activity was observed relative to sham ~500 ms post stimulation (although it did not survive multiple comparisons correction).

A clear increase in spindle activity between 750 ms and 1.5 s post stimulation was observed in both the SP and SPd conditions (see Fig. 2b, bottom). In the SP condition, stimulation appeared to truncate the stimulated spindle ~500 ms post stimulation, corroborating earlier results51.

To inform parameter selection in future closed-loop auditory stimulation paradigms, we investigated the correlations between evoked and detected activity (note that correlation results are reported uncorrected for multiple comparisons; complete results are available in tabular format in Supplementary Table 2). For each of the three stimulation conditions, we computed correlations between the magnitude of the detected oscillation (either SO or spindle according to the condition) and evoked oscillations in (a) the SO and (b) the spindle band. We also computed (c) the correlation in magnitude between the evoked oscillations (i.e., SO and spindle activity). The results are presented schematically in Fig. 3.

Fig. 3: Correlations between detected and evoked signal magnitude.

The mean magnitude of subjects’ slow oscillations in the SO-upstate stimulation condition (a) was significantly correlated with the amplitude of their evoked slow oscillation. No correlations were observed between detected and evoked amplitudes in the spindle stimulation (SP) condition (b). The mean magnitude of the detected spindle was correlated with that of the evoked spindle in the delayed stimulation (SPd) condition (c). STIM: timing of stimulation, RESP: timing of response measurement, n.s.: non-significant.

For the slow oscillation condition, all detected and evoked values were computed at the Fpz electrode. There was a significant correlation between the amplitude of the SO at detection and the amplitude of the evoked SO (r = 0.73, p < 0.001), suggesting that participants with overall stronger SOs at detection were those who showed stronger evoked SOs. The amplitude of the SO at detection was not correlated with the magnitude of the spindle activity (r = −0.07, p = 0.78), suggesting that participants with larger SOs were not necessarily those who produced larger spindles after stimulation. The magnitude of the two evoked responses was not correlated (r = 0.08, p = 0.75), suggesting that strong SO evocation did not imply strong spindle evocation.

For the two spindle stimulation conditions (i.e., SP and SPd), all detected and evoked values were computed at the Cz electrode. In the SP condition (i.e., stimulation during spindles), none of the correlations tested were significant: detected spindle magnitude did not correlate with the magnitude of evoked slow wave activity (r = -0.02, p = 0.93), nor evoked spindle activity (r = 0.37, p = 0.09), and the amplitude of evoked slow wave activity was not correlated with that of evoked spindle activity (r = 0.27, p = 0.22). These results indicate that the average magnitude of evoked responses cannot be predicted from a subject’s mean spindle activity upon detection, when stimulating during a spindle.

In the SPd condition (i.e., stimulation 450 ms post spindle detection), the correlation between magnitude of spindle activity at detection and evoked slow wave activity was not significant (r = 0.02, p = 0.93). However, a significant relationship was found with the magnitude of evoked spindle activity (r = 0.66, p = 0.004). We found no correlation between the magnitude of the evoked responses (r = 0.33, p = 0.19). These results indicate that the average magnitude of evoked spindle (but not slow wave) activity can be predicted from a subject’s mean spindle activity upon detection, when stimulating after a spindle.
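As an illustration of the kind of analysis described above, the following sketch computes a Pearson correlation between per-subject detected and evoked magnitudes, and converts r into a t statistic with n − 2 degrees of freedom. The data values, variable names, and sample size are hypothetical, since per-subject values are not given in the text, and this is not the authors' analysis code.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (df = n - 2) for testing r against zero; requires |r| < 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical per-subject mean magnitudes (microvolts), one value per participant.
detected = [42.0, 55.5, 38.2, 61.0, 47.3, 52.8]
evoked = [18.1, 25.0, 15.4, 27.9, 19.6, 24.2]

r = pearson_r(detected, evoked)
t = r_to_t(r, len(detected))
print(f"r = {r:.2f}, t({len(detected) - 2}) = {t:.2f}")
```

The p-value would then be read from a t distribution with n − 2 degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.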

Behavioural effects of brain stimulation

Having characterized the neurophysiological responses and their individual variability, we next examined behavioural outcomes. To evaluate the effect of the experimental manipulation on behavioural performance, a repeated measures ANOVA (rmANOVA) was employed for each task, investigating the changes in performance over Time (i.e., between the pre- and post-experimental manipulation measurements) and between Conditions.
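With only two time points, the Time × Condition interaction of such a mixed-design ANOVA is mathematically equivalent to a one-way ANOVA on each participant's change score (post minus pre), with degrees of freedom (k − 1, N − k) for k conditions and N participants. The sketch below illustrates this equivalence on made-up scores. The data, the three condition labels used as dictionary keys, and the function name are illustrative assumptions, not the study's data or analysis code.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical pre/post accuracy scores for three conditions (four subjects each).
pre = {"Sleep": [0.70, 0.65, 0.80, 0.75], "SO": [0.72, 0.68, 0.77, 0.74],
       "Wake": [0.69, 0.71, 0.78, 0.73]}
post = {"Sleep": [0.68, 0.66, 0.79, 0.71], "SO": [0.70, 0.69, 0.74, 0.73],
        "Wake": [0.66, 0.70, 0.75, 0.72]}

# The interaction test reduces to comparing change scores between conditions.
change = [[b - a for a, b in zip(pre[c], post[c])] for c in pre]
f_stat, df1, df2 = one_way_anova_f(change)
print(f"Time x Condition: F({df1}, {df2}) = {f_stat:.2f}")
```

This also explains the degrees of freedom reported for the interaction terms: five conditions give a numerator of 4, and the denominator is the number of participants minus 5.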

The rmANOVA results for the Grid Location Task revealed a significant main effect of Time (F(1, 97) = 29.69, p < 0.001, ω2 = 0.03). However, the interaction between Time and Condition was not significant (F(4, 97) = 0.11, p = 0.98, ω2 = 0.00), indicating that while performance generally decreased during the interval between pre- and post-testing, the experimental conditions did not differentially affect the change in performance on the declarative memory task (GLT) over time (see Fig. 4a).

Fig. 4: Behavioural performance.

Learning curves averaged across all participants (top row) and change in performance by condition for each behavioural task (bottom row), for (a) the declarative task (GLT), (b) the procedural task (MSL), and the complex task (piano learning), which was measured by (c) pitch accuracy and (d) rhythm accuracy. While all task metrics except piano pitch accuracy showed a global change over time across conditions, there were no significant interactions between Time and Condition for any task. Grey boxplots in the bottom row represent the main effect of Time averaged across all conditions. GLT Grid Location Task, MSL Motor Sequence Learning task, n.s.: non-significant.

The rmANOVA results for the Motor Sequence Learning task revealed a significant main effect of Time (F(1, 96) = 5.19, p = 0.03, ω2 = 0.01). However, the interaction between Time and Condition was not significant (F(4, 96) = 0.71, p = 0.59, ω2 = 0.00), indicating that while all participants improved their performance on the task, the experimental condition did not differentially affect their change in performance on the procedural task (MSL) over time (see Fig. 4b).

Concerning pitch accuracy in the Piano Learning task, the rmANOVA results indicated that the main effect of Time was not significant (F(1, 96) = 1.52, p = 0.22, ω2 = 0.00). Similarly, the interaction between Time and Condition was not significant (F(4, 96) = 1.13, p = 0.347, ω2 = 0.00), suggesting an absence of change in performance between the pre- and post-manipulation measures in terms of pitch accuracy on the Piano task (see Fig. 4c).

Concerning rhythm accuracy in the Piano Learning task, the results of the rmANOVA showed a significant main effect of Time (F(1, 96) = 5.19, p = 0.03, ω2 = 0.01), but not in interaction with Condition (F(4, 96) = 0.71, p = 0.59, ω2 = 0.00). This indicates that while participants across all groups improved their rhythm accuracy performance, this improvement did not differ systematically between experimental conditions (see Fig. 4d).

To explore the causal relationship between the strength of evoked brain oscillations and task performance improvements, we examined the correlation between stimulation effectiveness (quantified by evoked oscillation amplitude) and performance change (see Fig. 5). We tested all correlations for each task and each evoked response in both the slow oscillation and sleep spindle stimulation (including delayed) groups, focusing on comparisons that are most informative for the present research questions. Complete results for all correlations are found in Supplementary Table 3.

No statistically significant correlations were found between change in performance on the declarative task and amplitude of the evoked response in either frequency band of interest (for complete results see Supplementary Table 3). For the procedural task, correlations between improvement in accuracy and evoked sleep spindle activity were not significant in either condition (SP: r = 0.27, p = 0.23; SPd: r = 0.45, p = 0.08). Analysis of the Piano task revealed distinct patterns when examining pitch versus rhythm accuracy. For pitch accuracy, we identified a significant negative correlation between evoked spindle activity and performance accuracy in the SP condition (r = −0.50, p = 0.02). Similarly, for rhythm accuracy, evoked spindle activity in the SPd condition showed a significant negative correlation with rhythm performance (r = −0.57, p = 0.02).

Discussion

The purpose of this study was to compare the effects of closed-loop auditory stimulation when applied directly upon spindle detection, with a delay following detection, and using the more common slow oscillation-upstate stimulation, to ascertain its effects on physiology as well as declarative, procedural, and complex memory consolidation.

We first investigated the physiological effects of the three stimulation conditions. We found robust physiological responses in all three stimulation conditions as compared to sham (Fig. 2), which were broadly similar to one another and to the effects reported for open-loop stimulation (i.e., stimulation presented randomly during N2 and N3 sleep55,56,57). Namely, presenting sounds in sleep that are not loud enough to cause awakening evokes a slow oscillation (also referred to as a K-complex58) and increased spindle activity about a second later. Note that it is not possible to compare the SO and spindle stimulation conditions quantitatively in this project because a different recording system and EEG montage were used. The results suggest that all three kinds of stimulation affect the strength and timing of the evoked oscillations, which are of interest for sleep-dependent memory consolidation. If these evoked oscillations are able to reactivate temporarily-stored memories and stimulate their replay, then all three stimulation conditions may provide the necessary circumstances for memory consolidation (noting that the present study does not investigate the informational content of the events). However, timing of stimulation did seem to matter. Stimulation presented immediately upon spindle detection (SP condition) appeared to terminate the spindle early (Fig. 2b, bottom). This shortening of the sleep spindle was not observed in the delayed condition (SPd), where only increased spindle activity post stimulation was present. This observation is consistent with our previous work investigating CLAS of spindles, which compared the physiological effects of stimulating the first versus second half of a spindle. Only early stimulation generated this effect51. The results across both studies suggest that neural input during the spindle might result in its termination, as proposed previously59.
If the endogenous spindles were involved in information transfer to cortex for long-term storage, this could mean that early stimulation might interrupt this process in the SP condition but not the SPd condition. In future work, to further test this idea, instead of presenting single sounds immediately upon spindle detection, trains of clicks could be used (similarly to Ngo et al.36) as a means of repeatedly shortening subsequent evoked spindles, so as to adduce evidence for their role in information transfer through loss-of-function. Disrupting endogenous activity selectively can also be leveraged to causally infer the role of spindles with different characteristics such as those defined by their temporal occurrence with other spindles (i.e., if they are present in trains or isolated15,16) or with slow oscillations (i.e., coupled spindles13,60,61,62).

Brain stimulation studies frequently note differences in evoked response amplitudes63,64, which might be correlated with effectiveness of memory manipulations or therapeutic interventions65. These variations may prove important for understanding how memory systems can be influenced, and for predicting who might be most susceptible to different sorts of modulation techniques, for which reason we investigated correlations between detected and evoked signal magnitude in each stimulation condition (summarized in Fig. 3). The largely null correlations between detected and evoked activity suggest that stimulation effectiveness is not simply determined by baseline oscillation strength, indicating that other factors determine individual responsiveness to CLAS. Within each band, however, subjects with larger SOs at detection tended to be those who produced larger evoked SOs, and subjects with larger spindles at detection tended to be those who showed larger evoked spindle activity (in the SPd condition but not SP). Interestingly, the average strength of one’s SOs or spindles upon detection was not linked to how strong the opposite type of evoked oscillation was. This observation suggests that while the stimulation target seems to be successfully evoked, the effect does not represent an overall susceptibility of one’s brain to respond to stimulation across both frequency ranges. Additionally, the magnitude of evoked SOs and spindles did not correlate with one another in any of the stimulation conditions, suggesting that despite the temporal co-occurrence apparent in the average time series across participants (as observed in Fig. 2), individuals’ susceptibility to produce evoked SO and spindle activity was not linked. This result implies that auditory stimulation may not consistently generate coupling.
In our previous work66, we found partly diverging results in a paradigm focusing on stimulating SO-spindle coupled events (note that coupling analysis requires multiple nights’ data due to its low prevalence, and was therefore not examined in the present work). In brief, the amplitude of oscillations at detection did not predict the amplitude of the response in either frequency band. However, significant correlations were found between strength of evoked SO and spindle activity, but only when auditory stimulation occurred at specific spindle phases (i.e. rising and peak), suggestive of a common generative mechanism under some circumstances.

Interpretation of the correlational analyses between detected and evoked oscillation strength (and those in Jourde et al.66) must be tempered by the caveat that the populations studied are limited to healthy young adults, who show less variability in SO strength than do other populations such as older adults (who are of interest as they might ultimately be a target for CLAS-based interventions13,62). Furthermore, the range of SO strength at detection is restricted by the detection algorithm. Slightly weaker-than-threshold SOs were not detected, thereby truncating the range of possible values and lowering the likelihood of finding correlations. It is entirely possible that, given a broader range of SO or spindle amplitudes at detection that are present in different populations, additional dependencies will be revealed which may be useful to predict or tune the effectiveness of auditory manipulations.

Our results nonetheless underscore the importance of timing stimulation to neural events, precisely and selectively, to evoke different neurophysiological responses. They also suggest some independence between the generation and degree of susceptibility of the neural circuits generating the two types of evoked responses, which could be leveraged in future causal investigations of the roles of sleep oscillations. Future work on algorithms may also find ways of optimizing detection to individuals (as demonstrated via online adaptation50). Another approach might be to adapt detection algorithms or stimulation parameters in real time, based on the effectiveness of each stimulation.

Having established the neurophysiological effects of our stimulation protocols, we next examined their behavioural consequences using three tasks with wake and undisturbed sleep control groups. The behavioural outcomes proved to be complex. First, we confirmed that the task learning results showed reasonable improvements during the training periods, suggesting that subjects were able to learn and developed a memory trace upon which processes of offline memory consolidation could act, as evidenced by a flattening learning curve and overall high performance level at the end of the training period (Fig. 4, top row). Next, we investigated the main effect of behavioural change before and after the nap or wake period (Fig. 4, bottom row; summarized in the grey boxplot), and differences between conditions. Three out of four behavioural metrics showed effects of time, but not all were positive. Overall, people got worse on the declarative task (GLT), better at the procedural task (MSL), and better at the more procedural complex task metric (i.e., Piano rhythm accuracy), whereas the more declarative complex task metric (i.e., Piano pitch accuracy) did not change significantly. These results are interesting in themselves, as they suggest that declarative and procedural memory consolidation mechanisms are in fact somewhat dissociable, at least with regard to their decay versus gain due to the passage of time (factors which are likely also affected by the complexity of the task, the level of expertise of the learner, and the volume of training67). In fact, sleep-dependent memory effects are sometimes observed as ‘less forgetting’ rather than an actual gain68.

In line with the growing recognition of inconsistent sleep-memory relationships29, the condition to which the subjects were assigned seemed to have little to do with the degree of change of performance in pre- versus post-testing, as indicated by the lack of any statistical interaction between Time and Condition (Fig. 4, bottom row). These results mean that not only did the stimulation not improve consolidation at least at the group level, but that undisturbed sleep did not improve consolidation (or reduce forgetting) beyond that expected by the passage of time.

The overall increase in procedural performance observed across all conditions, including wake, is consistent with emerging perspectives that consolidation processes operate across multiple brain states rather than being truly sleep-dependent30,69. However, the absence of sleep-dependent memory effects in our nap design warrants careful interpretation and limits conclusions about stimulation specificity, as discussed below.

Noting considerable variability in both the amplitude of evoked responses and in the behavioural change pre- versus post-sleep, we also explored correlations between the evoked activity in each frequency band and task-related changes at the subject level, for each stimulation condition (summarized in Fig. 5a). While the strength of evoked slow oscillations was not correlated with change in any of the behavioural metrics, the evoked spindle activity showed a sub-threshold positive relationship with MSL accuracy in both the spindle stimulation conditions but not SO stimulation condition. Conversely, evoked spindle activity was significantly negatively correlated with Piano pitch accuracy in the SP but not SPd condition (uncorrected), and was negatively correlated with Piano rhythm accuracy in the SPd but not SP condition (also at uncorrected critical values). Although we hesitate to interpret the direction of these individual effects in light of current models of memory consolidation due to their weak statistical properties and inconsistency, it seems plausible that the magnitude of evoked response in the spindle band may prove to be an important factor in determining memory outcomes, at least across tasks that have a procedural component. Note that the modest correlations with piano performance (p = 0.02) would not survive correction for multiple comparisons, for which reason we emphasize their tentative nature and the need for replication in future studies.

Fig. 5: Correlations between change in performance on the behavioural tasks and magnitude of evoked responses.

a Summary of statistical significance and direction of correlation for all tasks and evoked responses. b Change in performance on the declarative task (GLT) was not correlated with evoked spindle activity in either of the spindle stimulation conditions (SP, SPd). c Although the correlations between change in performance on the procedural task (MSL) and evoked spindle activity were not significant, both conditions (SP, SPd) showed a positive trend. Evoked spindle activity correlated negatively with (d) pitch accuracy in the SP condition and (e) rhythm accuracy in the SPd condition. This discrepancy may suggest that stimulation timing is important for consolidation of complex tasks. Red: spindle stimulation (SP) condition, orange: delayed stimulation (SPd) condition. Note that correlations were exploratory and significance values were not corrected for multiple comparisons. GLT Grid Location Task, MSL Motor Sequence Learning task, n.s. non-significant.

The robust physiological responses observed across all stimulation conditions, combined with the absence of clear behavioural benefits, raise considerations for optimizing CLAS parameters. Rather than indicating ineffectiveness of the approach, this dissociation suggests that memory consolidation may depend on more specific aspects of neural oscillations than those targeted by our current stimulation protocols. Research emphasizes the importance of slow oscillation-spindle coupling for memory consolidation12,13, with better temporal coordination predicting superior behavioural outcomes. Our stimulation approach, while successfully evoking individual oscillations, may not have optimally enhanced this coupling. Future work could target coupled SO-spindle complexes specifically, as demonstrated in recent methodological advances66. Additionally, the timing and characteristics of endogenous oscillations may be critical. Animal studies suggest that the precise phase relationships within and between oscillations determine their functional effectiveness70. In humans, correlational work indicates that spindle frequency, duration, and density all relate to memory gains15, suggesting that future stimulation protocols might benefit from targeting specific spindle characteristics rather than spindles in general. The early termination of spindles observed with immediate stimulation (SP condition) versus the enhancement observed with delayed stimulation (SPd condition) illustrates how timing critically affects physiological outcomes. It is possible that disrupting ongoing spindles interfered with endogenous memory processes, while delayed stimulation may have been more conducive to memory consolidation by extending or enhancing spindle activity without interruption. Our findings therefore provide a foundation for future parameter optimization while highlighting the complexity of translating physiological manipulation into behavioural benefits. The systematic documentation of timing-dependent effects offers valuable guidance for developing more targeted stimulation approaches that may better align with the natural dynamics of memory consolidation processes.

While a benefit of SO-CLAS on memory has been replicated, results are inconsistent across studies. In a recent review, Esfahani et al.71 reported that all studies in healthy young and middle-aged adults effectively generated evoked neural responses, yet only about 40% showed significant memory improvements33,36,37,38,53,72; effects on memory were inconclusive in a further seven studies39,40,41,42,43,44,54 (summarized in Supplementary Fig. 1). These mixed findings align with broader recognition that sleep-memory relationships are less straightforward than previously assumed29, with numerous contradictory and non-replicable findings reported across the field. Our null behavioural results, combined with robust physiological effects, contribute to this pattern. These results could also suggest that either the effect of auditory stimulation is not strong enough to reliably produce memory effects (possibly meaning that more forceful brain stimulation techniques such as transcranial magnetic stimulation are needed), or that CLAS works but is not optimized in commonly used designs, including in the present study. CLAS’s effects may be weak due to sub-optimal stimulation strategies. In prior work using simultaneous EEG and source-localized magnetoencephalography, Jourde et al.55 showed that the effectiveness of sound stimulation to generate evoked responses was determined by the tissue excitability state in frontal ventral regions (i.e., orbitofrontal cortex), but that up-states in these regions coincided with up-states as detected in frontal EEG channels (as are used in most work, see ref. 71) in only 12% of cases. This result suggests that there is considerable room to improve the effectiveness of SO-CLAS, by either using a different electrode montage or optimizing timing. Similarly, determining the best means of capturing spindles and timing stimulation (e.g., to hit spindle up-states, see ref. 66, or early versus late in their temporal evolution51), may improve the effectiveness of stimulation and therefore the consistency of cognitive and memory effects. In general, while the neurophysiological outcomes are consistently observed, it will be necessary to develop highly precise, effective, and perhaps personally-optimized causal methods, and to further develop mechanistic models of how stimulation interacts with memory circuits in sleep55,56.

Some design choices made in the present study are important to interpret the current results, and when considering the design of future work. While our study was adequately powered to replicate the robust neurophysiological effects of CLAS, behavioural effect sizes in sleep-memory research are highly variable and inconsistent. Recent meta-analyses suggest that sleep-memory studies can require 30–200 participants to detect behavioural effects reliably, reflecting the inconsistent nature of these effects rather than simply their magnitude. However, the absence of even trending behavioural effects in our data suggests the issue may not simply be statistical power. Recent reviews emphasize that sleep-memory relationships are more nuanced than previously assumed, with memory benefits occurring only under specific, still poorly-understood conditions29.

The nap design employed in this study represents both a strength and limitation that requires careful consideration for future work. While nap paradigms provide experimental control by eliminating circadian confounds and enabling precise timing of learning-sleep intervals, they may be inadequate for demonstrating robust sleep-dependent memory effects. Our finding that procedural memory improved regardless of sleep/wake condition, while declarative memory declined across all groups, could mean that the consolidation processes we aimed to study may require longer periods of sleep. This interpretation gains support from the broader CLAS literature: reviewing studies summarized in Supplementary Fig. 1, we note that the majority of null behavioural findings come from nap rather than overnight designs, with none of the three previously-reported nap studies yielding positive outcomes39,43,44,71. This pattern suggests that effective CLAS may be ‘dose-dependent’, requiring either a critical number of stimulations achievable only over a full night, or it could mean that specific aspects of nocturnal sleep architecture such as multiple NREM-REM cycles that are absent in afternoon naps are necessary for memory benefits to manifest. Future work should prioritize overnight between-subjects designs that include both wake and undisturbed sleep control conditions — a design feature not always implemented in studies claiming sleep-specific effects. Such designs are essential for establishing that sleep itself provides memory benefits before attributing additional effects to stimulation interventions.

We elected to use three tasks (a simple declarative task, a simple procedural task, and a complex task) in the same training session, as a means of being able to compare the effects of stimulation across well-studied representative tasks that target specific memory systems, and to extend the work to a complex task that has some ecological validity. As with any complex task, there may be additional considerations for what exactly is being learned and the nature of the timecourse of that learning process, which may be differentially dependent upon sleep processes, if at all. It is possible that doing three tasks prior to the nap or wake period created some interference, or merely decreased the amount of experimental time that could be dedicated to learning each task and thus the quality of the memory representation upon which consolidation could act. Interference effects tend to be weak and are typically elicited under specific experimental conditions designed to maximize interference73, in which overlapping information is given to disrupt an existing memory trace68. The MSL and piano tasks do share similar properties (finger tapping, sequence learning with the left hand) that could potentially create interference effects. However, task order was randomized across participants to avoid systematic order effects, meaning that interference would only affect a subset of participants depending on the task sequence they experienced. Furthermore, while such interference effects are theoretically possible, the brief duration and limited scope of the shared motor demands suggests that they would be minimal in the present design. 
Given that real-world learning typically involves multiple, overlapping motor and cognitive activities throughout the day, meaning that interference effects cannot dominate human learning, and that interference effects even on highly overlapping laboratory tasks are weak, it seems unlikely that strong interference effects are present that could explain our null results. With regard to length, the training paradigms used for the GLT and MSL tasks are similar to implementations in previous work. Although the specific piano task has not previously been used in the sleep context, its length seems comparable to related work74 and average pitch accuracy was above 80%, suggesting that subjects had a decent representation of the melodies in short-term storage. Although it seems unlikely that interference or insufficient task practice are driving factors behind the behavioural results, future work using complementary designs, with single or multiple tasks and with comparisons across tasks within individuals, will be needed to further clarify the roles of sleep in naturalistic human learning.

Future investigations using higher spatial resolution methods such as magnetoencephalography or high-density EEG would be better suited to examine hemisphere-specific effects of spindle stimulation, particularly given that our motor tasks were performed with the left hand. The broad scalp distribution of spindle activity detected with standard EEG montages limits the ability to investigate lateralized effects that may be relevant for unilateral motor learning. Additionally, device limitations may have contributed to the absence of clear sleep-memory associations. While the Dreem headband achieved 83.5% sleep staging accuracy compared to expert scorers75, and our stimulation devices successfully evoked physiological responses in past work51,66, the spatial resolution of our recording systems may have been insufficient to capture more subtle or localized sleep-memory relationships. Future work using higher-density EEG or MEG recording montages might reveal sleep-memory associations that our current approach could not detect.

Another important limitation of the current study is our inability to compare broader sleep metrics (power, event density, coupling) across stimulation groups due to the use of different EEG systems with distinct montages and configurations. Future work using standardized recording systems across all conditions would enable investigation of whether CLAS affects overall sleep architecture beyond immediate evoked responses. Such analyses are critical for understanding whether stimulation redistributes existing sleep events temporally or genuinely enhances sleep oscillations. Our recent within-subject work demonstrates that stimulation timing affects cumulative oscillatory power beyond immediate responses51, highlighting the importance of examining these broader effects in future between-group designs. Understanding whether CLAS redistributes existing sleep events versus genuinely enhances them is crucial for interpreting memory benefits and optimizing stimulation protocols. If stimulation primarily redistributes events through homeostatic mechanisms, future research might focus on targeting optimal temporal windows rather than increasing event density.

Other considerations include the intent of stimulation. While we had aimed to increase SO and spindle activity as a means of increasing memory consolidation, it is difficult to know using this approach whether the evoked oscillations are equivalent to endogenous memory reactivation, although they seem morphologically similar55,58. For this reason, disruption may be a more powerful means of first identifying the roles of sleep oscillations76. The observed early termination of spindles in response to spindle stimulation may thus be harnessed to further investigate spindles’ roles via their disruption. In the current work, we successfully evoked brain responses using both SO upstate and spindle stimulation, but did not observe clear behavioural changes between experimental conditions as compared to control conditions. The results from prior CLAS studies targeting slow oscillation upstates have also shown clear neurophysiological responses but mixed memory outcomes. Given results showing that the timing and source of endogenous events affect the effectiveness of stimulation in modifying brain responses, we suggest that further methods-focused development will improve the effectiveness of CLAS and thus make it a stronger tool for studying neuroplasticity and memory, with some clinical potential77,78.

Our results contribute to the discussion about sleep’s role in memory consolidation29,30, and raise two conceptual questions. First, given that memory consolidation can occur across different brain states, what specific conditions determine when sleep provides benefits over wake? Second, how can evoked brain activity during sleep be present without corresponding improvements in memory performance? These questions, along with the relative contributions of spindles and slow oscillations to different sorts of memory, remain stubbornly open. We have nonetheless taken several important steps towards addressing them. Specifically, our results demonstrate successful modulation of slow oscillations and sleep spindles, confirming the effectiveness of closed-loop auditory stimulation. The differences in timing and amplitude of evoked activity in response to stimulating different neural events highlight the importance of precise online detection. More importantly, this work provides researchers with specific targets that yield diverse outcomes both neurophysiologically and behaviourally. Our findings must be interpreted within the context of our nap design limitations. The absence of sleep-dependent memory effects in our control conditions prevents definitive conclusions about stimulation specificity. Nonetheless, we believe this work establishes valuable research options to refine our understanding of sleep, facilitating the causal investigation of sleep events’ functions. In the future, in conjunction with advances in detection algorithms, this research direction will enable investigation of the roles of other memory-relevant neural patterns, including both coupled oscillations (i.e., SO-spindle complex) and grouped oscillations (i.e., spindle trains).

Methods

Study design

One hundred and twenty-six healthy neurotypical right-handed participants between the ages of 18–40 were recruited in total for this experiment, with the aim of including approximately 20 subjects per group who each slept at least 30 minutes in N2 and N3, combined. Sample size was determined based on neurophysiological effect sizes from our previous spindle-targeted CLAS work51. Power analysis indicated that N = 20 per group provided >99% power to detect neurophysiological changes (Cohen’s d ≈ 2.0) at α = 0.05. Behavioural analyses were exploratory, recognising that behavioural effect sizes in CLAS studies range from non-significant to moderate, making reliable power calculations for behavioural research questions challenging. We reasoned that consistent neurophysiological effects provide a foundation for exploring behavioural outcomes.
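As a rough plausibility check of the stated power, the following sketch uses a normal approximation for a two-sided comparison of two independent groups. The text does not specify the test or software behind the power analysis, so the two-group framing, the function name `approx_power_two_sample`, and the approximation itself are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Uses the normal approximation to the test statistic (an assumption;
    an exact calculation would use the noncentral t distribution).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # two-sided critical value
    noncentrality = d * sqrt(n_per_group / 2)  # expected z under the alternative
    # Probability of exceeding the critical value in either tail.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


power = approx_power_two_sample(d=2.0, n_per_group=20)
print(f"approximate power: {power:.5f}")
```

Under these assumptions, an effect of d ≈ 2.0 with 20 participants per group yields power well above 0.99, in line with the figure reported above.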

Pregnancy, BMI ≥ 40, and poor sleep habits were grounds for exclusion due to potential effects on sleep quality79,80. Left-handedness was also an exclusion criterion to ensure homogeneous motor control across participants for the left-hand motor tasks. Individuals with over 2 years or 500 hours of music training were excluded to ensure similar baseline performance on the complex memory (piano) task. All participants self-reported having normal or corrected-to-normal hearing, no history of any diagnosed psychiatric or sleep disorder81, and were not taking medications that target the nervous system.

Across all conditions, fourteen participants were replaced for not sleeping enough. Of these, 6 had been in the Sleep condition, 4 in the SO condition, and 2 in each of the spindle conditions (SP and SPd). This distribution does not suggest that sound stimulation was the cause of poor sleep resulting in exclusion. Three subjects were replaced for falling asleep in the Wake condition (> 5 mins in N2 and N3 sleep, combined; values were 6, 7.5, and 40 minutes, respectively) and seven participants were excluded from the analysis due to technical issues. One hundred and two participants (age: M = 24.6, SD = 5.9) were retained in the main data analyses (72 F, 29 M, 1 non-disclosed; final sample sizes: Wake N = 20, Sleep N = 20, SO N = 20, SP N = 22, SPd N = 20; see Supplementary Table 1 for a breakdown of age and sex by experimental condition). Additionally, Portiloop data from three participants in the SPd condition and two subjects from the Sleep condition were unavailable due to technical issues. One subject from the SPd condition was unable to complete the Piano post-sleep task due to a wifi failure. Finally, four subjects – two from the SO condition, and one from each of the SP and SPd conditions – had no correct sequences in the motor sequence learning task post session (a common error pattern in this task is to omit one of the duplicated key presses as the first and last note of the sequence are the same). These subjects were removed only from the affected analyses. Participants received either course credits or financial compensation for their time. The study protocol was approved by the Concordia University Research Ethics Committee, and all participants signed an informed consent form.

The study procedure is illustrated in Fig. 1. The experimental session consisted of one five-hour visit. All participants completed the study at the same time of day (12–5 pm) to eliminate circadian confounds. They were asked to abstain from caffeine and alcohol the night before and the day of the experiment, and to abstain from nicotine, marijuana, and other recreational drugs for a week before the experiment, due to the effects of these substances on sleep architecture and latency82,83,84,85. Participants were instructed to go to sleep one hour later than their usual bedtime to increase sleep pressure and facilitate afternoon napping.

On the day of the experiment, all participants completed a series of sleep questionnaires. Following completion, participants were randomly assigned to one of five conditions: wake (Wake), no stimulation/sham (Sleep), slow oscillation stimulation (SO), spindle stimulation (SP) or delayed spindle stimulation (SPd). In the sleep condition, participants had the opportunity to sleep for up to two hours, and were asked to stay in bed and relax if they were unable to sleep. In the wake condition, participants were asked to read or write/draw for two hours. The use of electronics was not permitted. After the two-hour session, all participants were retested on the same cognitive tasks.

Participants completed all three behavioural tasks immediately before the assigned condition, with approximately 30 minutes between testing completion and the start of the sleep/wake period to allow for EEG setup. Task order was consistent within each participant (identical for pre- and post-testing) but randomized across participants to control for order effects. Between tasks, participants had the opportunity to take short breaks as needed.

Questionnaires

All participants were asked to complete the Montreal Music History Questionnaire (MMHQ86) prior to the experiment to confirm eligibility. The MMHQ is used to assess musical experience, including the number of training/practice hours completed, the age at which they began their formal training, as well as some basic demographic information. On the day of the experiment, participants completed a health questionnaire in addition to several sleep questionnaires including the Morningness-Eveningness Questionnaire87, the Pittsburgh Sleep Quality Index88 and the Epworth Sleepiness Scale89.

Cognitive tasks

The Psychomotor Vigilance Task (PVT90), was implemented to confirm participant alertness before and after the nap opportunity. The participant is shown a black screen with a red circle in the centre. Once the red circle disappears, participants are instructed to press the space bar of the computer as quickly as possible. This repeats for twenty trials within a five-minute period. Performance on the task was measured by the average reaction time in milliseconds on the fastest 10% of trials (as used in previous work91).
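The PVT metric described above can be sketched as follows. The function name `pvt_fastest_mean` and the rounding rule for the number of retained trials are assumptions; the text specifies only that the fastest 10% of trials were averaged.

```python
from statistics import mean


def pvt_fastest_mean(reaction_times_ms, fraction=0.10):
    """Mean reaction time (ms) over the fastest `fraction` of trials.

    The count of retained trials is rounded to the nearest integer and
    kept at one or more; this rounding rule is an assumption.
    """
    if not reaction_times_ms:
        raise ValueError("at least one reaction time is required")
    n_keep = max(1, round(len(reaction_times_ms) * fraction))
    fastest = sorted(reaction_times_ms)[:n_keep]
    return mean(fastest)


# Twenty hypothetical trials: the two fastest (300 and 310 ms) are averaged.
example_rts = [300 + 10 * i for i in range(20)]
print(pvt_fastest_mean(example_rts))
```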

The Grid Location Task (GLT) is a declarative memory task requiring participants to remember the location of images that are sequentially displayed in random positions on a grid2. In the present implementation, 24 images were displayed in a 5 × 5 grid. Presentation and testing phases were alternated for a minimum of 2 cycles until a 70 percent threshold of correct responses was attained. Performance on the task was measured as the number of images placed within the correct grid location (referred to as ‘GLT Accuracy’, see Fig. 4a, top row, for learning curve across participants). The number of items (24) and learning criterion (70% accuracy) were determined through pilot testing, and differ from other implementations (e.g., Rasch et al.2 used 15 items, 50–60% criterion; Rudoy et al.92 used 50 items with a distance-based criterion). Parameters were optimised in the present work for reliable learning within approximately 20 minutes while maintaining sufficient difficulty for measurable consolidation effects in our healthy young adult population. Task parameter variations across studies reflect the assumption that robust sleep-memory relationships that have ecological validity should be demonstrable across reasonable task variations.

The Motor Sequence Learning (MSL) task is a procedural learning task that requires participants to use their left hand to repeat a five-key sequence on a keyboard as quickly and as accurately as possible93. The specified sequence was B-X-V-C-B in which B represents the index finger, while X represents the pinky finger. The participant was asked to repeat the sequence as quickly and accurately as possible for a 30-second block followed by a 30-second rest period, for a total of 12 active blocks (12 minutes). In the literature, performance on the MSL task is reported using a variety of highly-related accuracy and speed metrics. We elected to use the number of correct sequences produced per 30 s block (referred to as ‘MSL Accuracy’), noting that due to the time constraints on sequence production, it is strongly correlated with speed-focused metrics such as the average speed of execution of correct sequences (r = −0.47, p = 5.04 × 10⁻⁷). Being a speed-dependent task, the measure is susceptible to the influence of sleep inertia, and participants require some re-familiarizing with the task before true performance may be assessed. For these reasons and following previous research94, the first two blocks after the sleep/wake interval were excluded as a warm-up period to avoid a decrease in performance due to sleep inertia and to re-establish task familiarity. Performance data were therefore averaged over the last three blocks of the pre-sleep learning period, and over the third, fourth and fifth blocks of the post-sleep testing period (see Fig. 4b top).
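The block selection described above (last three training blocks versus post-interval blocks three to five) might be expressed as in the sketch below. The function name `msl_pre_post` and the list-based input format are illustrative assumptions, not the analysis code used in the study.

```python
from statistics import mean


def msl_pre_post(pre_blocks, post_blocks):
    """Return (pre, post) MSL Accuracy from per-block correct-sequence counts.

    pre_blocks:  counts for the training blocks, in order.
    post_blocks: counts for the blocks after the sleep/wake interval, in order.
    """
    if len(pre_blocks) < 3 or len(post_blocks) < 5:
        raise ValueError("need at least 3 pre blocks and 5 post blocks")
    pre = mean(pre_blocks[-3:])    # last three training blocks
    post = mean(post_blocks[2:5])  # blocks 3-5; blocks 1-2 are treated as warm-up
    return pre, post


# Hypothetical counts of correct sequences per 30 s block.
pre_counts = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
post_counts = [8, 10, 15, 16, 17, 18]
print(msl_pre_post(pre_counts, post_counts))
```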

Musical training, which incorporates visual, spatial, and motor abilities, and involves both declarative and procedural memory, is well-suited for exploring neuroplasticity in complex human learning95. Previous research investigating the causal role of sleep oscillations supports a sensitivity of this type of learning to sleep74. The piano-learning task was adapted from a longitudinal design created by Herholz et al.96 and consists of a learning session and two testing sessions. This task integrates both declarative and procedural memory, representing a more ecologically valid task. The right-handed participants were asked to learn to produce 20 short melodies, using their left hand to increase difficulty and thus allow more room for motor skill improvement. The learning session consists of 6 trials (i.e., repeated attempts to reproduce a melody) for each of the melodies, in which participants first listened to them and then were asked to produce them on a MIDI keyboard. The melody is played to the participants with a visual representation of the keys needed on a keyboard diagram displayed on the computer monitor. Visual cues, in which the key sequence was highlighted during the melody presentation, were offered only on the second and fourth trials, to enable novices to play the melodies. Visual cues were absent in the other trials to discourage reliance upon them and encourage plasticity relating to auditory-motor rather than visual-motor associations. After each trial in the learning phase, participants received feedback on their pitch and rhythmic accuracy in the form of two visual symbolic indicators (i.e., coloured smiley faces). Pitch feedback was binary (all keys correct or not), and rhythm feedback was given in three ranges of accuracy on the basis of pilot testing (good, moderate, poor).

Performance on the task was measured using two metrics: pitch accuracy, calculated as the percentage of correctly played notes in the melody, and rhythm accuracy, calculated based on inter-onset intervals (IOIs) in milliseconds – the time between consecutive note onsets. For each melody, the algorithm computed: (1) IOIs for both the reference melody and participant’s performance, (2) the absolute difference between expected and actual IOIs for each interval, (3) relative errors by dividing absolute errors by the average reference IOI duration, (4) the average relative error across all intervals in the melody, and (5) rhythm accuracy as 100% minus the average relative error. This score was then normalized based on each participant’s worst learning performance for each melody and inverted so higher values indicated better performance. For test trials where participants performed worse than at any time during training, normalized rhythm scores below zero were set to 0. This procedure served as a subject-specific outlier rejection threshold to exclude trials where extreme timing errors (e.g., stopping mid-trial upon error recognition) created negative accuracy values that do not reflect meaningful memory performance. The procedure was applied identically to both pre-test and post-test sessions. This task incorporates both procedural memory components (motor sequence learning, timing coordination, and sensorimotor skills) and declarative memory components (explicit knowledge of note sequences and conscious recall of melody patterns). While the motor execution aspects are primarily procedural, learning and reproducing specific melodies requires explicit memory for note sequences, making this a task that likely engages multiple memory systems. The integration of these memory systems in musical learning reflects the complexity of real-world learning situations, where task requirements cannot be neatly categorized into single memory domains97. 
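Steps (1) to (5) of the rhythm accuracy computation can be sketched as below. The function names, the onset-list input format, and the handling of mismatched note counts (raising an error) are assumptions. The subsequent per-participant normalization is not reproduced, since the text does not give its exact formula.

```python
def inter_onset_intervals(onsets_ms):
    """Step 1: time (ms) between consecutive note onsets."""
    return [later - earlier for earlier, later in zip(onsets_ms, onsets_ms[1:])]


def rhythm_accuracy(reference_onsets_ms, played_onsets_ms):
    """Raw rhythm accuracy (%) for one melody, before any normalization."""
    ref_iois = inter_onset_intervals(reference_onsets_ms)
    played_iois = inter_onset_intervals(played_onsets_ms)
    if not ref_iois or len(ref_iois) != len(played_iois):
        # Assumption: this sketch only scores performances whose note count
        # matches the reference melody.
        raise ValueError("reference and performance must have matching notes")
    mean_ref_ioi = sum(ref_iois) / len(ref_iois)
    # Steps 2-3: absolute IOI errors, relative to the mean reference IOI.
    relative_errors = [abs(p - r) / mean_ref_ioi for r, p in zip(ref_iois, played_iois)]
    # Step 4: average relative error across all intervals in the melody.
    mean_relative_error = sum(relative_errors) / len(relative_errors)
    # Step 5: accuracy is 100% minus the average relative error (in percent).
    return 100.0 * (1.0 - mean_relative_error)


# Hypothetical melody with 500 ms IOIs; the second note is played 50 ms late.
print(rhythm_accuracy([0, 500, 1000, 1500], [0, 550, 1000, 1500]))
```

Large timing errors drive this raw score below zero, which is the situation the zero floor described above is meant to handle.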
For each task, we used Performance change as our main metric to measure behavioural changes after the 2-hour period of either sleep or wake. Performance change was computed as (Post − Pre)/Pre, where ‘Pre’ represents performance before the sleep opportunity (or wake period) and ‘Post’ represents performance afterwards.
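A minimal sketch of the Performance change metric follows. The function name `performance_change` and the explicit guard against a zero baseline are assumptions added for illustration.

```python
def performance_change(pre: float, post: float) -> float:
    """Relative change in performance: (Post - Pre) / Pre."""
    if pre == 0:
        # The ratio is undefined for a zero baseline (guard is an assumption).
        raise ValueError("pre-interval performance must be non-zero")
    return (post - pre) / pre


# Hypothetical example: 14 correct sequences before, 16 after the interval.
print(performance_change(pre=14, post=16))
```

Positive values indicate a gain over the interval and negative values indicate a loss, expressed as a proportion of pre-interval performance.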

Data collection

After completing these behavioural assessments, participants underwent the experimental manipulation while electroencephalography (EEG) data were collected. Electroencephalography is a standard non-invasive method of capturing neural oscillatory activity in the cortex. The electroencephalogram captures changes in electrical potentials between electrodes placed on the scalp98. EEG signals were collected using different devices because the two types of stimulation required different equipment: the Endpoint Corrected Hilbert Transformation (ecHT) device (Elemind Technologies, Inc., Cambridge, USA) for SO stimulation and the Portiloop49 for the spindle stimulation conditions. All participants were equipped with both measuring devices for homogeneity of experience. The ecHT device acquired EEG data from the Fpz-M1 channel at a sampling rate of 500 Hz. The Portiloop recorded EEG signals at 250 Hz using four electrodes placed at midline locations (Fz, Fpz, Cz, and Pz) according to the international 10–20 system99, and referenced to the left mastoid. The ground electrode was positioned on the left ear lobe. Finally, most participants also wore the Dreem Headband (Dreem.com75), which was used for automatic offline sleep staging to quantify sleep duration in each stage, and to exclude waking epochs from physiological analyses. Note that real-time sleep staging was not used, as the detection algorithms identify slow oscillations and sleep spindles based on their waveforms, which occur naturally only during N2 and N3 sleep.

SO peaks were automatically detected on Fpz10 using the ecHT device and its online detection algorithm. EEG data were low-pass filtered in the delta band (0.5–1.5 Hz) and analyzed for SO characteristics derived from100: time points of positive-to-negative zero crossings were identified, and intervals were then isolated that had a sufficiently low negative peak (−40 μV), a negative-peak duration between 125 and 1500 ms, and an amplitude range of 75 μV between the peaks.
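The detection criteria above can be illustrated offline with a short Python sketch. This is not the ecHT device's online algorithm. It assumes the input is already filtered to the delta band, reads the −40 μV and 75 μV values as thresholds (negative peak at or below −40 μV, peak-to-peak range of at least 75 μV), and takes the "duration of the negative peak" to be the duration of the negative half-wave. The function and key names are hypothetical.

```python
import numpy as np

def detect_slow_oscillations(x_uv, fs, neg_thresh_uv=-40.0, p2p_thresh_uv=75.0,
                             min_neg_s=0.125, max_neg_s=1.5):
    """Return candidate SO events from a delta-band signal given in microvolts."""
    x = np.asarray(x_uv, dtype=float)
    negative = x < 0
    # Positive-to-negative zero crossings: index of the first negative sample.
    down = np.flatnonzero(~negative[:-1] & negative[1:]) + 1
    events = []
    for start, stop in zip(down[:-1], down[1:]):
        seg = x[start:stop]                   # one full cycle between crossings
        nonneg = np.flatnonzero(seg >= 0)
        neg_len = nonneg[0] if nonneg.size else seg.size
        neg_dur_s = neg_len / fs              # duration of the negative half-wave
        neg_peak = seg.min()
        p2p = seg.max() - neg_peak            # amplitude range between the peaks
        if (neg_peak <= neg_thresh_uv
                and min_neg_s <= neg_dur_s <= max_neg_s
                and p2p >= p2p_thresh_uv):
            events.append({"start": int(start), "stop": int(stop),
                           "neg_peak_uv": float(neg_peak),
                           "neg_dur_s": float(neg_dur_s),
                           "p2p_uv": float(p2p)})
    return events

# Synthetic check: a 1 Hz, 60 uV sine sampled at 500 Hz meets all criteria,
# whereas a 20 uV sine does not.
t = np.arange(5000) / 500.0
strong = detect_slow_oscillations(60.0 * np.sin(2 * np.pi * t + 0.1), fs=500.0)
weak = detect_slow_oscillations(20.0 * np.sin(2 * np.pi * t + 0.1), fs=500.0)
print(len(strong), len(weak))
```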

Sleep spindles were detected using the Portiloop49, a low-cost deep-learning-based stimulation system with a complex detection algorithm suitable for spindle CLAS. Spindle detection was performed at the Cz electrode, which effectively captures both slow and fast spindles101,102. The Portiloop’s detection algorithm was trained to detect sleep spindles in real time on a large dataset called MASS103, using offline detections by Warby et al.104 and labels by experts105. The current study used this device to deliver precisely timed sound stimulation during spindles for the two spindle stimulation conditions (i.e., immediately upon spindle detection and with a 450 ms delay).

All stimulation conditions used the same auditory stimulus to maximize comparability: a 15 ms burst of 55 dB SPL pink noise (with 5 ms linear ramps to avoid click generation in the earphones). For participants in the SO conditions, auditory stimulation was delivered as soon as the SO peak was detected. In the SP condition, stimulation was sent immediately upon spindle detection. In the SPdelayed condition, a delay of 450 ms was introduced between detection and stimulation to target the tail end of sleep spindles, as previous research suggested differential neurophysiological effects according to stimulation timing within the spindle51. Both closed-loop protocols (i.e., SO and spindle stimulation) began with 15 minutes of silence to allow participants to fall asleep, since eye blinks and movement artifacts during drowsiness can sometimes trigger false detections.
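For readers who wish to reproduce a comparable stimulus, a 15 ms pink-noise burst with 5 ms linear onset and offset ramps could be synthesized roughly as follows. This sketch assumes a 44.1 kHz audio sampling rate and generates pink noise by spectral shaping of white noise. Neither detail is stated in the text, and the presentation level (55 dB SPL) must be calibrated on the playback hardware rather than in code.

```python
import numpy as np

def pink_noise_burst(fs=44100, dur_s=0.015, ramp_s=0.005, seed=0):
    """Peak-normalized pink-noise burst with linear onset/offset ramps."""
    rng = np.random.default_rng(seed)
    n = int(round(dur_s * fs))
    spectrum = np.fft.rfft(rng.standard_normal(n))   # white-noise spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    scale = np.zeros_like(freqs)                     # DC component is removed
    scale[1:] = 1.0 / np.sqrt(freqs[1:])             # 1/f power = 1/sqrt(f) amplitude
    burst = np.fft.irfft(spectrum * scale, n=n)
    n_ramp = int(round(ramp_s * fs))
    ramp = np.linspace(0.0, 1.0, n_ramp)
    burst[:n_ramp] *= ramp                           # linear fade-in
    burst[-n_ramp:] *= ramp[::-1]                    # linear fade-out
    return burst / np.max(np.abs(burst))             # scale to a peak of +/-1

stimulus = pink_noise_burst()
print(stimulus.size, stimulus[0], stimulus[-1])
```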

EEG data analysis

All data were analyzed in Python using custom scripts built on freely available packages. We used NumPy for array manipulation, SciPy’s signal module for filtering operations (including notch, band-pass, and band-specific filtering with butter and filtfilt), and Matplotlib for visualizing both filter frequency responses and average brain responses. In all analyses, filters were applied before defining epochs to prevent border effects. To control for data quality, we applied a 4th order Butterworth band-pass filter (0.5 to 30 Hz) and then ran a custom artifact rejection script on each recording, which automatically identified and removed problematic data sections based on both absolute signal amplitude and sudden amplitude changes (typically caused by movement). After epoching and epoch-wise demeaning, epochs with absolute amplitudes exceeding ±250 μV (on broadband filtered data; 0.1–40 Hz) were excluded from analysis. To analyze the evoked responses in the slow-wave band, we applied a 4th order Butterworth band-pass filter (0.1 to 4 Hz) to the raw data. To analyze spindle band activity, we used the same filter parameters but with an 11–16 Hz frequency band. Since the phases of both detected and evoked sleep spindles can vary, we used the signal envelope (i.e., the absolute value of the analytic signal obtained from the Hilbert transform of the spindle band signal) as our metric of spindle power over time. Across all conditions involving detections of sleep events (Sleep, SO, SP, and SPd), epochs were extracted from 2.5 seconds before to 2.5 seconds after event detection and baseline-corrected using the mean amplitude from 2.5 to 0.5 seconds prior to detection, a window chosen to avoid capturing amplitude changes generated by potential endogenous coupling106. Sleep staging was extracted from the Dreem headband for participants in all conditions except SP, for which manual scoring was performed because the equipment was unavailable. Only epochs occurring during N2 and N3 were included in the analysis.
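The filtering, envelope, and epoching steps described above might be sketched as follows. This is an illustration under stated assumptions rather than the study's scripts: the helper names are hypothetical, and the filter uses SciPy's second-order-sections form (`sosfiltfilt`) instead of `filtfilt` for numerical stability at low cut-off frequencies. Note that zero-phase forward-backward filtering doubles the effective filter order.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def bandpass(x, fs, low_hz, high_hz, order=4):
    """Zero-phase Butterworth band-pass applied to the continuous recording."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def spindle_envelope(x, fs):
    """Magnitude of the analytic signal of the 11-16 Hz band-passed trace."""
    return np.abs(hilbert(bandpass(x, fs, 11.0, 16.0)))

def epoch_and_baseline(x, fs, event_samples, tmin=-2.5, tmax=2.5,
                       baseline=(-2.5, -0.5)):
    """Cut epochs around events and subtract the pre-event baseline mean."""
    n_pre, n_post = int(round(-tmin * fs)), int(round(tmax * fs))
    b0 = int(round((baseline[0] - tmin) * fs))
    b1 = int(round((baseline[1] - tmin) * fs))
    epochs = []
    for s in event_samples:
        if s - n_pre < 0 or s + n_post > len(x):
            continue                      # skip events too close to the edges
        ep = np.array(x[s - n_pre:s + n_post], dtype=float)
        ep -= ep[b0:b1].mean()            # baseline correction
        epochs.append(ep)
    return np.array(epochs)

# Synthetic usage: filter first, then epoch, as described in the text.
fs = 250
t = np.arange(15000) / fs
raw = 10.0 * np.sin(2 * np.pi * 13.0 * t) + 5.0   # 13 Hz "spindle" plus offset
env = spindle_envelope(raw, fs)
epochs = epoch_and_baseline(env, fs, event_samples=[1000, 2000, 14990])
print(epochs.shape)
```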

Average evoked responses in both frequency bands of interest (0.5–1.5 Hz to capture slow wave activity, SWA, and 11–16 Hz to measure spindle activity, FSA) were computed for each participant by averaging all epochs. Timeseries were then averaged across participants for statistical comparison.

Slow wave activity at detection was extracted as the amplitude of each subject’s slow wave activity timeseries at time of detection. The evoked slow wave peak-to-peak amplitude was quantified as the difference between amplitude values measured at 550 ms (i.e., mean amplitude in a time window from 500 to 600 ms) and 900 ms post-stimulation (i.e., mean amplitude in a time window from 800 to 1000 ms) to account for latency variability between participants, following the methodology established in55.

Spindle activity at detection was extracted as the magnitude of each subject’s spindle band envelope timeseries at time of detection. The magnitude of the spindle envelope signal from 0.75 to 1.5 s after stimulation onset was computed to assess evoked spindle activity, following previous research using the same stimulation device and auditory stimuli51. To account for individual differences, a baseline correction was applied by subtracting each subject’s mean magnitude measured during the pre-stimulus period (−2.5 to −0.5 s relative to stimulus onset).
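The window-based measures described in the two paragraphs above reduce to means over fixed time windows of each participant's averaged timeseries. The sketch below is illustrative only. It assumes that the evoked slow wave amplitude is taken as the later window minus the earlier window (the text does not state the direction of the difference) and that "magnitude" refers to the mean envelope in the window. All function names are hypothetical.

```python
import numpy as np

def window_mean(ts, times, t0, t1):
    """Mean of a timeseries between t0 and t1 (seconds, inclusive)."""
    mask = (times >= t0) & (times <= t1)
    return ts[mask].mean()

def evoked_so_peak_to_peak(so_ts, times):
    """Difference between the 800-1000 ms and 500-600 ms window means."""
    return window_mean(so_ts, times, 0.8, 1.0) - window_mean(so_ts, times, 0.5, 0.6)

def evoked_spindle_magnitude(env_ts, times):
    """Mean envelope 0.75-1.5 s post-stimulus minus the -2.5 to -0.5 s baseline."""
    baseline = window_mean(env_ts, times, -2.5, -0.5)
    return window_mean(env_ts, times, 0.75, 1.5) - baseline

# Toy timeseries on a 250 Hz grid spanning -2.5 to 2.5 s.
times = np.arange(-625, 625) / 250.0
so_ts = np.zeros_like(times)
so_ts[(times >= 0.5) & (times <= 0.6)] = -20.0   # early negative deflection
so_ts[(times >= 0.8) & (times <= 1.0)] = 30.0    # later positive deflection
env_ts = np.full_like(times, 2.0)
env_ts[(times >= 0.75) & (times <= 1.5)] = 5.0   # evoked spindle activity
print(evoked_so_peak_to_peak(so_ts, times), evoked_spindle_magnitude(env_ts, times))
```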

To inform parameter selection in future CLAS studies, we investigated correlations between the magnitude of detected oscillations and evoked responses. These exploratory analyses examine whether baseline neural activity strength predicts stimulation effectiveness, which could guide personalized stimulation protocols. We quantified relationships between the magnitude of the detected oscillation and evoked oscillatory activity within and across frequency bands. Correlations were exploratory and significance values were not corrected for multiple comparisons.

Statistical analysis

To confirm that groups did not differ prior to the experimental manipulation, we conducted a one-way ANOVA on age, a Chi-square test of independence on sex, and a one-way ANOVA on pre-learning PVT reaction time scores as a proxy for general alertness. To assess whether auditory stimulation negatively affected overall sleep quantity, we compared sleep duration (N2 and N3 combined) in each stimulation condition to the unstimulated Sleep condition using independent-samples Student’s t-tests. Having confirmed group homogeneity, we next examined the electrophysiological effects of our stimulation protocols. To document these effects, we compared the evoked responses in the two frequency bands of interest between each stimulation condition and sham at each time point (independent-samples t-tests across subjects), controlling the False Discovery Rate with the Benjamini-Hochberg correction (alpha = 0.05; Fig. 2). To inform parameter selection in future closed-loop auditory stimulation paradigms, we computed correlations between evoked and detected activity for each of the three stimulation conditions (Fig. 3). Baseline group equivalence was assessed using one-way ANOVAs on pre-experimental performance. Repeated-measures ANOVAs were used to assess changes in behavioural performance over Time (i.e., between the pre- and post-experimental manipulation measurements) and between Conditions. When assumptions were violated, Greenhouse-Geisser corrections were applied for sphericity violations and Welch’s correction for homogeneity of variance violations. To explore potential relationships between the strength of evoked brain oscillations and task performance improvements, we correlated stimulation effectiveness (quantified by evoked oscillation amplitude) with performance change, focusing on comparisons that are most informative for the present research questions. Spearman’s correlations were used throughout in case of normality violations.
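The time-point-wise comparison with False Discovery Rate control, and the rank correlations, can be illustrated with SciPy. The sketch below is not the study's analysis code. The Benjamini-Hochberg step-up procedure is written out explicitly for transparency, the function names are hypothetical, and the data are simulated.

```python
import numpy as np
from scipy import stats

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of p-values rejected under Benjamini-Hochberg FDR control."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.flatnonzero(below))   # largest rank meeting its threshold
        reject[order[:k + 1]] = True        # reject it and all smaller p-values
    return reject

def pointwise_ttest_fdr(stim, sham, alpha=0.05):
    """Independent-samples t-test at each time point (subjects x time arrays)."""
    t_vals, p_vals = stats.ttest_ind(stim, sham, axis=0)
    return t_vals, p_vals, benjamini_hochberg(p_vals, alpha)

# Simulated example: 20 subjects per group, 100 time points, with an evoked
# effect in the stimulation group between samples 40 and 59.
rng = np.random.default_rng(1)
stim = rng.normal(size=(20, 100))
stim[:, 40:60] += 3.0
sham = rng.normal(size=(20, 100))
_, _, significant = pointwise_ttest_fdr(stim, sham)
print(significant[40:60].all())

# Spearman's rank correlation between evoked amplitude and performance change.
rho, p_value = stats.spearmanr([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 25.0, 40.0])
```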