Introduction

Sleep consolidates and transforms newly acquired information, making it long-lasting for future use1,2,3. While extensive evidence has established sleep’s benefits in consolidating episodic memory4, how sleep transforms memory remains unclear. Theoretical models propose that sleep transforms idiosyncratic memory episodes into generalized gist or schema5,6,7,8,9,10,11. This sleep-mediated memory transformation is largely inferred from pre- vs. post-sleep behavioral changes in memory tests implicating integration, generalization, and schematization12,13,14,15,16. Despite these promising findings, behavioral measurements may conflate memory decay with memory transformation and fall short in characterizing the complexity and fidelity of memory representations11,17. Therefore, it is desirable to obtain direct neural evidence delineating the sleep-mediated memory representational transformation.

We aim to address this question by leveraging the analytical power of representational similarity analysis (RSA) to examine memory representations at different levels in the human brain18. Specifically, RSA can decompose neural representations of individual items into item- and category-level representations19,20,21. Item-level representations capture neural representations unique to specific stimuli22, while category-level representations capture neural patterns shared across stimuli within semantic categories23,24. Applying RSA to EEG recordings, we examined both item- and category-level representations within the pre-sleep learning and post-sleep retrieval sessions, respectively (i.e., within-session RSA), and item-/category-level representational similarity between pre- and post-sleep sessions (i.e., cross-session RSA). We hypothesized that if memory representations are transformed into more generalized gist over sleep, item-level representations would be diminished or even abolished after sleep (i.e., reduced item specificity)25 (Fig. 1A and B). In contrast, we anticipated that category-level representations would persist and remain identifiable in the post-sleep retrieval session (i.e., enduring gist-like information). Moreover, we expected that memory representational similarity between pre- and post-sleep sessions would be lower than the memory representational similarity within each session, after controlling for temporal proximity effects.

Fig. 1: Experimental paradigm and analytic scheme of memory representational transformation across sleep.

A The experimental procedure includes pre-sleep learning (i.e., encoding and maintenance), pre-sleep cued-recall and cued-recognition tests on half of the learned items, overnight sleep with TMR cueing during slow-wave sleep, and post-sleep tests, including cued-recall, mental retrieval with a written report and cued-recognition. EEG was recorded throughout the entire experiment. B Within-session RSA examined memory representations within each of the pre-sleep learning and post-sleep mental retrieval sessions; Cross-session RSA examined the memory representational similarity between these two sessions. Both the within-session RSA and cross-session RSA were performed at two different levels: item-level and category-level. Item-level representations were obtained by contrasting the within-item (WI) similarity versus the within-category (WC) similarity, while the category-level representations were obtained by contrasting the WC similarity versus between-category (BC) similarity. Memory representational transformation would be manifested by the following indices: decreased item-level representations, while persistently prominent category-level representations from pre- to post-sleep; and the low cross-session pre-post similarities. C Schematic illustration of the interactive roles of sleep stages and their associations with memory representational strength change. SWS*REM reflects the complementary roles, while REM/SWS reflects differential roles between these two sleep stages. SWS slow-wave sleep, REM rapid eye movement sleep.

More critically, how different sleep stages, particularly SWS and REM sleep, interactively contribute to memory representational transformation remains controversial. A prominent theoretical perspective, encompassing the two-stage model of memory consolidation and the sequential hypothesis1,6,26,27, suggests that SWS and REM sleep complement each other in optimizing memory consolidation and transformation. Specifically, during SWS, repeated memory reactivation would integrate newly encoded memories into pre-existing memory schema, transforming hippocampal-dependent memory into more neocortex-dependent gist-like representations, whereas REM sleep further consolidates the memory reactivated during SWS through synaptic plasticity. These models highlight the interdependence of SWS and REM within the same night of sleep as a critical mechanism for effective memory consolidation. Supporting this hypothesis, animal studies suggest that SWS-initiated cortical plasticity for memory consolidation is reinforced by the following REM sleep episode28,29. Moreover, human studies showed that the product of the relative amount of SWS (%) and REM sleep (%) explains overnight memory consolidation better than either stage alone, indicating their complementary roles30,31,32. From this perspective, a higher SWS*REM should be associated with greater overnight memory representational transformation (Fig. 1C).

In contrast, alternative theoretical frameworks propose that SWS and REM sleep play differential and even opposing roles in sleep-mediated memory transformation9,33. According to these frameworks, SWS supports memory preservation through stabilizing item-specific representations formed during initial learning, whereas REM sleep facilitates memory transformation via gist abstraction and integration with semantic themes. Supporting these frameworks, research has shown that REM sleep duration is positively associated with schema-conformant memory consolidation and creative problem-solving, while SWS duration showed an opposite trend34,35. Consistent with this idea, a recent study showed that greater memory distortion or modification occurs after REM-rich sleep, while stabilization of the undistorted original memory occurs after SWS-rich sleep36. Within these theoretical frameworks, the REM/SWS ratio serves as a proxy for the relative engagement of transformation-oriented versus preservation-oriented consolidation processes. A higher REM to SWS ratio (i.e., REM/SWS) is therefore hypothesized to be associated with a greater shift toward memory representational transformation (Fig. 1C).

To empirically test these contrasting hypotheses, we combined overnight sleep EEG recordings with representational similarity analysis (RSA) to track memory transformation across sleep (Fig. 1). Participants learned 96 unique word-picture pairs, with pictures drawn from four categories: animals, plants, electronic devices, and transportation tools. After a distractor task, they completed cued-recall and cued-recognition tests on half of the pairs. Participants then underwent overnight sleep, during which targeted memory reactivation (TMR) cues from half of the learned pairs were delivered during SWS37. Approximately 30 min after waking, participants completed a post-sleep session consisting of a cued-recall test, a mental retrieval task with a written report, and a cued-recognition test for all items. During the mental retrieval task, participants heard cue words and were instructed to vividly imagine the associated pictures. Immediately after this task, they provided a written report of the retrieved content to verify accuracy. Each trial of the pre-sleep learning task and of the post-sleep mental retrieval task lasted 5 s, with each item being repeated three times. Our results revealed substantial memory representational transformation from pre- to post-sleep, as indicated by reduced item-level representations alongside persistently prominent category-level representations for successfully retrieved items. Notably, this transformation was associated with a greater proportion of REM sleep relative to SWS.

Results

A total of 35 participants (26 females; mean age ± SD: 22 ± 2.79 years) were included in the analysis. In the memory tests, the cued-recall task tested category memory, whereas the cued-recognition task tested item memory. In the pre-sleep test session, participants showed a mean accuracy of 0.51 (SD = 0.17) in the cued-recall task and a mean accuracy of 0.78 (SD = 0.08) in the cued-recognition task. After an overnight sleep, cued-recall accuracy was numerically higher (mean ± SD: 0.54 ± 0.13; t(34) = −1.86, p = 0.072), while cued-recognition accuracy showed a significant decline (mean ± SD: 0.76 ± 0.07; t(34) = 3.07, p = 0.004). In addition, pre-sleep performance positively predicted post-sleep memory performance (all ps < 0.001, Supplementary Fig. 1).

To minimize the influence of forgetting on estimates of memory transformation, the current study focused mainly on trials with confirmed post-sleep retrieval success. Retrieval success was operationalized based on participants’ written reports following the post-sleep mental retrieval task. A trial was classified as “remember” if the written description reflected specific and accurate visual features of the target picture (e.g., “a white daisy”) corresponding to the cue word; trials with incorrect or blank responses were classified as “forget”. On average, participants achieved an accuracy of 0.40 (SD: 0.18) in this written report task. Further analysis showed no significant difference between TMR cued versus uncued items in the retrieval accuracy based on the written report (t(34) = −1.68, p = 0.102, Supplementary Fig. 1) or in the other memory tests (all ps > 0.523; Supplementary Fig. 1). The TMR effect was not the primary focus of the current study and has been reported in detail elsewhere37.

Overnight neural representational transformation for post-sleep remembered items

To understand how memory representation transforms across an overnight sleep, we first examined the neural representations during the pre-sleep learning session (i.e., including both encoding and maintenance periods) for post-sleep remembered items. Following previous studies19,20,38, we performed the RSA on the trial-based EEG power patterns to extract item- and category-level neural representations (Fig. 2A). These representations capture fine-grained item-specific information and generalized categorical information, respectively. For item-level representations, we contrasted the EEG power pattern similarities between trials of the same pictures (Within-item, WI similarity) versus the similarity between trials of different pictures from the same category (Within-category, WC similarity) (Fig. 2B, see also the “Methods” section). For category-level representations, we contrasted the WC similarity with the similarity between trials of different pictures from different categories (Between-category, BC similarity) (Fig. 2C).
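
The two contrasts described above can be illustrated with a minimal sketch. It assumes that a trial-by-trial similarity matrix has already been computed and that every item belongs to exactly one category; the function name `rsa_contrasts` and the toy values are hypothetical and are not taken from the analysis code of this study.

```python
import numpy as np


def rsa_contrasts(sim, item_ids, category_ids):
    """Return (item_level, category_level) from a trial-by-trial similarity matrix.

    item_level     = mean WI similarity - mean WC similarity
    category_level = mean WC similarity - mean BC similarity
    Assumes each item belongs to a single category (hypothetical helper).
    """
    sim = np.asarray(sim, dtype=float)
    item_ids = np.asarray(item_ids)
    category_ids = np.asarray(category_ids)

    # Count each unordered trial pair once and skip self-pairs.
    upper = np.triu(np.ones(sim.shape, dtype=bool), k=1)
    same_item = item_ids[:, None] == item_ids[None, :]
    same_cat = category_ids[:, None] == category_ids[None, :]

    wi = sim[upper & same_item].mean()              # same picture
    wc = sim[upper & same_cat & ~same_item].mean()  # different pictures, same category
    bc = sim[upper & ~same_cat].mean()              # different categories
    return wi - wc, wc - bc


# Toy example: trials 0 and 1 repeat one picture, trial 2 is another picture
# from the same category, and trial 3 comes from a different category.
item_ids = [0, 0, 1, 2]
category_ids = [0, 0, 0, 1]
sim = np.array([
    [1.0, 0.5, 0.3, 0.1],
    [0.5, 1.0, 0.3, 0.1],
    [0.3, 0.3, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
])
item_level, category_level = rsa_contrasts(sim, item_ids, category_ids)
```

In this toy matrix the WI, WC, and BC means are 0.5, 0.3, and 0.1, so both contrasts come out at about 0.2. In the actual analysis the contrasts are evaluated per participant and time window and then tested with cluster-based permutation statistics, which this sketch does not cover.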

Fig. 2: Neural representations for post-sleep remembered items within pre-sleep learning and post-sleep mental retrieval sessions, respectively.

A Within-item (WI), within-category (WC), and between-category (BC) similarities during pre-sleep learning for post-sleep remembered items. B and C Significant item-level representations and category-level representations were identified for post-sleep remembered items in the clusters. D WI, WC, and BC similarities during post-sleep mental retrieval for post-sleep remembered items. E and F No significant item-level representations, but significant category-level representations were identified for post-sleep remembered items in the cluster. Significant clusters with pcluster < 0.05 were circled by black lines. Z(ρ): Fisher Z-transformed Spearman’s correlation coefficient.

To examine the temporal dynamics of representational transformation, we computed the similarity values by correlating the EEG power pattern across frequencies (2–40 Hz) and all clean channels between artifact-free learning trials, in 500 ms sliding time windows with a stride of 100 ms during the 5 s post-stimulus epoch. We observed significantly greater WI similarity than WC similarity within a cluster spanning ~500–4100 ms post-stimulus onset (pcluster = 0.003, corrected by the non-parametric cluster-based permutation test, Fig. 2B), indicating item-level representations. We also found significantly greater WC similarity than BC similarity within a cluster spanning ~0–1100 and 2500–4800 ms post-stimulus onset (pcluster = 0.042, Fig. 2C), indicating category-level representations. To ensure these effects were not driven by below-chance similarity, we tested WI, WC, and BC similarity values within clusters and found all were significantly above chance, i.e., zero (all ps < 0.001). A control analysis on post-sleep forgotten items revealed only a cluster indicating item-level representations (pcluster < 0.001), but no category-level representations (pscluster > 0.806, Supplementary Fig. 2) during pre-sleep learning. Comparisons between post-sleep remembered and forgotten items revealed a significant cluster indicating Remember < Forget item-level representations (pcluster = 0.020, Supplementary Fig. 2) and a significant cluster indicating Remember > Forget category-level representations (pcluster = 0.018, Supplementary Fig. 2), which may reflect that a greater transformation from item-level to semantic category-level representations during learning predicts better long-term memory39.
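
The sliding-window similarity computation described above could be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: power is assumed to be stored as a trials × features × time array (features being the flattened channel × frequency pattern), power is averaged within each window before correlation, and only matching windows are compared between trials. The function name `sliding_window_similarity` and the random input are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def sliding_window_similarity(power, sfreq, win_s=0.5, step_s=0.1):
    """Fisher z-transformed Spearman similarity between trials per time window.

    power : array of shape (n_trials, n_features, n_times)
    Returns an array of shape (n_windows, n_trials, n_trials).
    Assumption: power is averaged over time within each window.
    """
    power = np.asarray(power, dtype=float)
    n_times = power.shape[2]
    win = int(round(win_s * sfreq))    # window length in samples
    step = int(round(step_s * sfreq))  # stride in samples

    out = []
    for start in range(0, n_times - win + 1, step):
        feats = power[:, :, start:start + win].mean(axis=2)
        # Spearman's rho equals Pearson's r computed on ranked features.
        ranks = rankdata(feats, axis=1)
        rho = np.corrcoef(ranks)
        np.fill_diagonal(rho, np.nan)            # self-pairs are not used
        rho = np.clip(rho, -0.999999, 0.999999)  # keep arctanh finite
        out.append(np.arctanh(rho))
    return np.stack(out)


# Synthetic example: 6 trials, 20 features, 5 s sampled at 100 Hz.
rng = np.random.default_rng(0)
power = rng.random((6, 20, 500))
z = sliding_window_similarity(power, sfreq=100)
```

With a 5 s epoch, a 500 ms window, and a 100 ms stride, this yields 46 windows. Each resulting trial-by-trial matrix can then be summarized into WI, WC, and BC averages for the item-level and category-level contrasts.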

Next, we examined item- and category-level neural representations during the post-sleep mental retrieval session for post-sleep remembered items (Fig. 2D). In contrast to the pre-sleep learning session, we observed no significant clusters indicating item-level representations (pscluster > 0.318, Fig. 2E). However, we found significantly greater WC than BC similarity during a ~0–3000 ms time window post cue onset (pcluster = 0.028, Fig. 2F), indicating category-level representations. Control analyses on post-sleep forgotten items did not reveal any significant clusters indicating either item- or category-level representations (all pscluster > 0.132, Supplementary Fig. 2). Direct comparison between remembered and forgotten items revealed only a significant cluster indicating Remember > Forget category-level representations (pcluster = 0.038); there was no significant difference for item-level representations (pscluster > 0.602, Supplementary Fig. 2).

To ensure that the observed item-level and category-level representations were not confounded by differences in the number of trial pairs used in the RSA, we conducted a strict trial-matching control analysis. For each participant, we randomly selected the same number of trial pairs for the category-level RSA as were used in the item-level RSA. This control analysis revealed that category-level representations during both pre-sleep learning and post-sleep retrieval remained highly consistent with the results in Fig. 2C and F (Supplementary Fig. 3). Therefore, the differential changes between item-level and category-level representations after overnight sleep were not simply due to the pair number discrepancies in the RSA.

We next performed the cross-session RSA to examine the representational similarity between pre-sleep learning and post-sleep mental retrieval sessions (i.e., Pre-Post Similarity, Fig. 1B). Consistent with previous research39, we found no significant difference between Pre-Post WI versus WC similarities (i.e., item-level) or between Pre-Post WC versus BC similarities (i.e., category-level) for post-sleep remembered items (all pscluster > 0.314, Fig. 3A and B). However, post-sleep remembered items showed both greater Pre-Post WI and WC similarity than forgotten items (all pscluster < 0.049, Supplementary Fig. 4). These results suggest that while successfully retrieved items show greater cross-session neural pattern similarity than forgotten items, no discernible item-level or category-level representations were preserved from the pre- to the post-sleep session.

Fig. 3: Cross-session representational similarities and their contrast with the within-session representational similarities.

A and B No significant item-level or category-level Pre-Post similarity for post-sleep remembered items. C Within-session category-level representational similarities during both pre-sleep learning and post-sleep retrieval sessions were greater than the pre-post cross-session category-level representational similarity. The error bar indicates the standard error of the mean (SEM). Z(ρ): Fisher Z-transformed Spearman’s correlation coefficient. *p < 0.05.

Previous studies suggest that the presence of category-level representations within both pre- and post-sleep sessions, and their absence in cross-session RSA, may indicate distinctive memory formats after overnight sleep39,40,41,42. To test this, we compared within-session (Pre–Pre and Post–Post) category-level representational similarity with cross-session Pre–Post category-level representational similarity (see the “Methods” section). Specifically, Pre–Pre category-level similarity was computed as the average difference between WC and BC similarity within the significant cluster identified during the pre-sleep learning session (Fig. 2C). Similarly, the Post–Post category-level similarity was computed within the cluster identified during the post-sleep retrieval session (see Fig. 2F). The Pre–Post category-level similarity was computed as the average difference between WC and BC similarity, obtained by correlating neural activity across sessions using the significant time windows identified in the pre-sleep and post-sleep clusters. The results showed that both Pre–Pre and Post–Post category-level similarity were significantly higher than Pre–Post similarity (Pre–Pre vs. Pre–Post: t(34) = 2.43, p = 0.020; Post–Post vs. Pre–Post: t(34) = 2.17, p = 0.037, Fig. 3C), suggesting persistent but distinctive category-level representations after overnight sleep.

To rule out the possibility that pre-sleep testing (on half of the word–picture pairs) following pre-sleep learning may influence the memory transformation over sleep, we compared the memory representations between pre-sleep tested items and pre-sleep untested items. The results revealed that, among post-sleep remembered items, no significant differences were observed between tested and untested items at either the item- or category-level representations during the pre-sleep learning or post-sleep mental retrieval sessions (all pscluster > 0.135, Supplementary Fig. 5). In addition, no significant difference was found for cross-session Pre–Post similarity at either the item-level or category-level representations (all pscluster > 0.340, Supplementary Fig. 5). Similarly, to address whether TMR cues presented during slow-wave sleep influenced memory transformation, we compared memory representations between TMR-cued and uncued items, again focusing on post-sleep remembered items. No significant differences were observed at either the item- or category-level during pre-sleep learning or post-sleep retrieval sessions (all pscluster > 0.187, Supplementary Fig. 6). In addition, no significant difference was found for cross-session Pre–Post similarity at either the item-level or category-level representations (all pscluster > 0.195, Supplementary Fig. 6).

Taken together, these results indicate that memory representations were substantially transformed after an overnight sleep, as manifested by the weakened item-level representations, while persistent but distinctive category-level representations were maintained from pre-sleep learning to post-sleep retrieval. Moreover, this memory transformation cannot be attributed to TMR cueing or pre-sleep testing, and thus items across these conditions were pooled together in all subsequent analyses.

REM/SWS, but not SWS*REM, is associated with memory representational transformation for remembered items

We next investigated how SWS and REM sleep influence memory representational transformation. To answer this question, we first scored the sleep EEG using the Yet Another Spindle Algorithm (YASA) toolbox43, the results of which were further verified by an experienced sleep researcher (see the “Methods” section; see Supplementary Table 1 and Supplementary Fig. 7 for sleep staging). One participant with disconnected EEG recordings during sleep was excluded, resulting in 34 participants in the following data analysis.

We hypothesized that if SWS and REM sleep play complementary roles in memory representational transformation, then the product of SWS% and REM% (i.e., SWS*REM, %*%) would be associated with memory transformation, i.e., reduced item-level representational strength and relatively persistent category-level representational strength. In contrast, if SWS and REM sleep play differential roles in memory transformation, then the REM% relative to the SWS% (i.e., REM/SWS ratio) should be associated with memory representational transformation.

To test these hypotheses, we first quantified representational strength change over sleep by subtracting the mean similarity value of pre-sleep item-level or category-level representational strength from each cell of the corresponding post-sleep representational matrix. This resulted in a two-dimensional matrix, capturing time-resolved changes in post-sleep representational strength relative to pre-sleep learning. We then performed the correlation analysis between SWS*REM and each cell in the representational change matrix across participants (see Fig. 1C). The results revealed no significant clusters correlating SWS*REM with either the item-level strength change or the category-level strength change (all pscluster > 0.220, corrected by the non-parametric cluster-based permutation test, Fig. 4A and B).
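
The two sleep indices and the cell-wise correlation with the representational change matrix can be sketched as below. This is a simplified illustration: it assumes Pearson correlation across participants (the correlation type is an assumption here), omits the cluster-based permutation correction, and uses made-up sleep percentages. The helper names `sleep_indices`, `strength_change`, and `cellwise_correlation` are hypothetical.

```python
import numpy as np


def sleep_indices(sws_pct, rem_pct):
    """Return (SWS*REM product, REM/SWS ratio) per participant."""
    sws_pct = np.asarray(sws_pct, dtype=float)
    rem_pct = np.asarray(rem_pct, dtype=float)
    return sws_pct * rem_pct, rem_pct / sws_pct


def strength_change(post_matrix, pre_mean):
    """Subtract each participant's mean pre-sleep strength from every cell.

    post_matrix : (n_participants, n_rows, n_cols) post-sleep strength
    pre_mean    : (n_participants,) mean pre-sleep strength
    """
    post_matrix = np.asarray(post_matrix, dtype=float)
    pre_mean = np.asarray(pre_mean, dtype=float)
    return post_matrix - pre_mean[:, None, None]


def cellwise_correlation(index, change):
    """Pearson r between a sleep index and each cell across participants."""
    x = index - index.mean()
    y = change - change.mean(axis=0)
    num = np.tensordot(x, y, axes=(0, 0))
    den = np.sqrt((x ** 2).sum() * (y ** 2).sum(axis=0))
    return num / den


# Made-up values for four participants (percent of total sleep time).
sws_pct = np.array([20.0, 10.0, 25.0, 15.0])
rem_pct = np.array([20.0, 25.0, 20.0, 30.0])
product, ratio = sleep_indices(sws_pct, rem_pct)

# Toy 2 x 2 post-sleep matrices; two cells are built to track the ratio.
post = np.zeros((4, 2, 2))
post[:, 0, 0] = -ratio
post[:, 0, 1] = [0.0, 1.0, 0.0, 1.0]
post[:, 1, 0] = [1.0, 0.0, 0.0, 1.0]
post[:, 1, 1] = 2.0 * ratio
change = strength_change(post, pre_mean=np.full(4, 0.5))
r_ratio = cellwise_correlation(ratio, change)
```

By construction, the top-left cell correlates perfectly negatively and the bottom-right cell perfectly positively with the REM/SWS ratio. A cell that is constant across participants would give an undefined correlation, so real data would need a check for that case before cluster-level statistics are applied.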

Fig. 4: The interactive functional roles of SWS and REM sleep in memory representational transformation.

A and B SWS*REM showed no significant correlation with either item-level or category-level representational strength change (Post minus Pre). C and D REM/SWS was negatively associated with item-level representational strength change. E and F REM/SWS was positively correlated with category-level representational strength change. Significant clusters with pcluster < 0.05 were circled by black lines. Z(ρ): Fisher Z-transformed Spearman’s correlation coefficient; *p < 0.05; **p < 0.01; ***p < 0.001.

In contrast, we found that the REM/SWS ratio was significantly associated with post-sleep representational strength change. Specifically, REM/SWS ratio was significantly negatively correlated with item-level representational strength change in a cluster spanning ~2100–4800 ms post cue onset (pcluster = 0.028, within the cluster: β = −0.022, adjusted R2 = 0.196, p = 0.005, Fig. 4C and D). Moreover, the REM/SWS ratio was significantly positively correlated with the category-level representational strength change in a cluster ~1700–4800 ms post cue onset (pcluster = 0.021, within the cluster: β = 0.014, adjusted R2 = 0.126, p = 0.022, Fig. 4E and F). Further examining the temporal dynamics underlying the observed clusters, we found that the earlier portion of the category-level cluster (~2100–3600 ms) was primarily driven by representational strength changes among the second and third post-sleep retrieval repetitions, whereas the later portion (~3700–4800 ms) was mainly associated with representational strength changes among the first and second repetitions (Supplementary Fig. 8). These results suggest faster retrieval speed across repeated mental retrieval trials. However, no such effect was found for the item-level cluster (Supplementary Fig. 8).

To test whether the observed association between the REM/SWS ratio and memory representational strength changes was attributable to either sleep stage alone, we correlated representational strength changes with REM% and SWS% separately. The results revealed no significant clusters for either item-level or category-level representational strength change (all pscluster > 0.075, Supplementary Fig. 9). We also examined the relationship between the REM/SWS ratio and representational transformation among post-sleep forgotten items. No significant clusters were found (all pscluster > 0.282, Supplementary Fig. 10). We further conducted a control analysis using only the maintenance period from the pre-sleep learning trials to better match the perceptual input with the post-sleep mental retrieval session when calculating the representational strength change for post-sleep remembered items. The results remained unchanged (Supplementary Fig. 11), suggesting that our observed results are not simply driven by differences in visual perceptual input between pre-sleep learning and post-sleep mental retrieval sessions.

These results collectively suggest that SWS and REM sleep play differential roles, instead of complementary roles, in memory representational transformation. The greater amount of REM sleep, in contrast to SWS, is associated with significant memory representational transformation across participants, as indexed by the reduced item-level representational strength and enhanced category-level representational strength.

REM and SWS EEG power are differentially associated with the neural representational transformation for remembered items

Beyond REM and SWS duration, we next investigated which electrophysiological activities during REM sleep and SWS modulate this transformation. Prior studies have suggested that during REM sleep, frontal theta and beta activities contribute to memory consolidation44,45,46. Building on these results, we calculated the frontal (F3/4 channels) theta (4–7 Hz) and beta (15–25 Hz) power relative to the 1–40 Hz total EEG power (see the “Methods” section) across all REM sleep epochs. To better examine the effect of REM sleep power across the entire overnight sleep on memory representational transformation indexes, we created a comprehensive index of REM sleep power. Specifically, we multiplied the relative power in each frequency band by REM duration, resulting in the total power of each frequency band for REM sleep. The robust linear regression revealed that the total theta power across REM sleep was negatively correlated with item-level representational strength change within the cluster shown in Fig. 4C (β = −29.324, adjusted R2 = 0.142, pFDR = 0.016, Fig. 5A), and positively correlated with category-level representational strength change within the cluster shown in Fig. 4E (β = 21.600, adjusted R2 = 0.134, pFDR = 0.038, Fig. 5B). Similarly, total beta band power across REM sleep was negatively correlated with item-level representational strength change (β = −197.000, adjusted R2 = 0.168, pFDR = 0.016, Fig. 5C), with a positive but non-significant trend with category-level representational strength change (β = 101.700, adjusted R2 = 0.044, pFDR = 0.122, Fig. 5D). Note that a similar pattern of results was found when correlating relative REM theta or beta power with the memory representational strength change (Supplementary Fig. 12). All these correlations were corrected across different frequency bands using the false discovery rate (FDR) method.
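
The total-power index described above (relative band power multiplied by REM duration) can be illustrated with a short sketch. It assumes Welch's method for spectral estimation with 4 s segments, a single frontal signal per epoch, and REM duration expressed in minutes; none of these choices is confirmed by the text, and the function name `relative_band_power` and the synthetic signal are hypothetical.

```python
import numpy as np
from scipy.signal import welch


def relative_band_power(epochs, sfreq, band, total=(1.0, 40.0)):
    """Mean band power relative to total power across epochs.

    epochs : (n_epochs, n_samples) signal from one channel or channel average
    band   : (low, high) band limits in Hz, inclusive
    total  : (low, high) limits of the reference range in Hz
    """
    freqs, psd = welch(epochs, fs=sfreq, nperseg=int(4 * sfreq), axis=-1)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    in_total = (freqs >= total[0]) & (freqs <= total[1])
    # Frequency bins are evenly spaced, so a ratio of sums is a ratio of powers.
    rel = psd[:, in_band].sum(axis=1) / psd[:, in_total].sum(axis=1)
    return rel.mean()


# Synthetic check: five 30 s epochs of a pure 6 Hz oscillation at 100 Hz.
sfreq = 100
t = np.arange(30 * sfreq) / sfreq
epochs = np.tile(np.sin(2 * np.pi * 6 * t), (5, 1))

theta_rel = relative_band_power(epochs, sfreq, band=(4.0, 7.0))
beta_rel = relative_band_power(epochs, sfreq, band=(15.0, 25.0))

rem_minutes = 90.0  # assumed REM duration, for illustration only
total_theta = theta_rel * rem_minutes
```

For this synthetic 6 Hz signal nearly all power falls in the theta band, so the relative theta power is close to 1 and the relative beta power is close to 0. Weighting by duration makes the index sensitive both to how strongly a rhythm is expressed and to how long the stage lasts.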

Fig. 5: The relationship between sleep EEG power and memory representational strength change.

A and B Total frontal theta (4–7 Hz) power during REM sleep was negatively associated with the item-level strength change (post- minus pre-sleep) and positively associated with the category-level strength change (post- minus pre-sleep). C and D Total frontal beta (15–25 Hz) power during REM sleep was negatively associated with the item-level strength change, and a positive but nonsignificant trend was found between the beta power and category-level strength change. E and F Total frontocentral SO-related (1–1.25 Hz) power during SWS was negatively correlated with the category-level strength change, but not correlated with the item-level strength change. *pFDR < 0.05.

We next examined how neural activity during SWS relates to memory transformation, with a focus on slow oscillation (SO)-related power in the range of 1–1.25 Hz at the Fz channel. This frequency range was selected to minimize EEG power induced by slow-drift noise below 1 Hz and aligns with prior findings on slow oscillations below 1.25 Hz47. The results revealed that total SO-related power (i.e., SO-related power*SWS amount) was significantly negatively correlated with category-level representational strength change (β = −1.161, adjusted R2 = 0.179, pFDR = 0.014, Fig. 5F), a pattern opposite to that observed for REM theta and beta power. No significant correlation was found for item-level memory strength change (β = 0.122, adjusted R2 = −0.016, pFDR = 0.528, Fig. 5E). In addition, relative SO-related power showed a similar pattern when correlated with category-level representational strength change (Supplementary Fig. 12). We also examined whether delta (1–3 Hz) power at the Fz channel and spindle-related sigma (11–16 Hz) power at the Cz channel during SWS were associated with changes in memory representations. No significant correlation was found at either the item level or the category level (all psFDR > 0.08).

Discussion

Examining neural representations across pre-sleep learning, overnight sleep, and post-sleep mental retrieval sessions, we demonstrated that memory representations of individual items were substantially transformed. From pre-sleep learning to post-sleep retrieval, idiosyncratic item-level representations were abolished while category-level representations remained prominent. Most importantly, we provide evidence that REM and SWS differentially impact memory representational transformation. Specifically, a greater REM sleep to SWS ratio was associated with reduced item-level representational strength and increased category-level representational strength across sleep.

First, our study advances the understanding of overnight memory representational transformation, extending prior research on neural representational transformation observed within minutes or a few hours during wakefulness39,42,48,49. Previous research has shown that item-level memory representations are evident during both encoding and retrieval sessions within 1–2 h of wakefulness49. Our findings extend these studies by showing that, after overnight sleep, item-level memory representations were no longer evident, while category-level memory representations were persistently prominent from pre-sleep learning to post-sleep retrieval sessions. Importantly, this transformation cannot be explained by general memory decay or retrieval-related task demands, as it was specific to the cue word–picture associations that were successfully retrieved after sleep.

Our results can be well explained by the functional roles of sleep in transforming memory. Specifically, sleep has been proposed to facilitate the abstraction of the gist information (e.g., the concept of “animal”) from individual items within the same category (e.g., different animal pictures)50,51. These gist-like category representations may be more resistant to global synaptic downscaling—a sleep-associated process that enhances neural efficiency by reducing overall synaptic strength52,53. In contrast, synapses encoding fine-grained, item-specific features may be more susceptible to this downscaling, leading to sparser representations that are less likely to be detected using scalp EEG. In line with this account, previous research has suggested that shared conceptual information is more likely to be preserved after sleep, while memory for specific item details tends to weaken25. Consistently, our study found that category-level representations remain prominent and detectable during post-sleep retrieval, whereas item-level neural representations were not detectable, despite participants’ ability to recall individual pictures behaviorally. Moreover, sleep has been shown to enhance cortical neural pattern similarity between different memory items54, suggesting a shift toward more shared and integrated memory representations.

Most critically, our study addressed an under-investigated question: how SWS and REM sleep contribute to sleep-based memory transformation1,6,7,9,33. Despite some studies suggesting an interactive functional role of SWS and REM sleep in memory enhancement31,32,55, most sleep research has focused on the relationship between a single sleep stage (e.g., either SWS or REM) and behavioral changes, leading to mixed results34,35,56,57,58,59,60. In our study, we systematically examined the complementary and differential roles of SWS and REM sleep in memory representational transformation. Our findings provided evidence that a greater amount of REM sleep relative to SWS is associated with greater memory representational transformation, and that REM sleep power and SWS power show opposite associations with memory representational transformation. These results support existing theoretical frameworks9,33, which emphasize the critical role of REM sleep in memory representational transformation and the role of SWS in stabilizing memory in its original format. Accordingly, a greater proportion of REM sleep relative to SWS is associated with a greater shift toward memory representational transformation. Note that the differential roles of REM sleep and SWS may also vary across tasks, such as emotional memory tasks61, rule abstraction tasks62, and creative problem-solving tasks63, and may be affected by the amount of information learned pre-sleep25, which warrants future research.

Regarding the electrophysiological mechanisms, our results showed that both the REM sleep duration and REM sleep theta and beta power contribute to memory representational transformation. These results are consistent with previous studies showing that longer REM sleep is correlated with enhanced memory for semantic categories64 and facilitated semantic processing65,66. REM duration and theta power have also been linked to better consolidation of schema-congruent memories35, suggesting that REM facilitates the integration of new information into existing semantic networks, leading to more abstract, gist-like representations that resist interference9,67. Beyond theta oscillation, a recent human intracranial EEG study has observed beta oscillations during REM sleep46, which couple with theta activity68. Extending these studies, our findings showed similar functional roles of REM sleep theta and beta activities in memory representational transformation.

In addition, among the post-sleep remembered items, we found that greater EEG power in the SO band during SWS was associated with lower post-sleep category-level representational strength compared to pre-sleep. While the two-stage model proposes that memory representations repeatedly reactivated during SWS facilitate the memory transformation into more gist-like representations1,69, our study suggests that SWS itself may not necessarily facilitate the formation of gist-like representations. Instead, it may reflect a process of stabilizing the distinct parts of individual items while reducing the shared representations within the same visual category, a possibility that needs to be tested in future studies. Moreover, our findings do not contradict previous research showing that longer SWS duration is generally associated with better memory retention70,71. When taking memory decay into account, our data similarly revealed that individuals with longer SWS duration exhibited better retention of all pre-sleep tested items (Supplementary Fig. 13).

Notably, although the current study used the same dataset as previous research37, the research objectives were distinct. Our previous study examined whether targeted memory reactivation (TMR) during slow-wave sleep (SWS) elicited item-specific neural representations that predicted post-sleep retrieval performance. In contrast, the present study focuses on how sleep architecture—specifically the interaction between REM and SWS—contributes to the overnight transformation of memory representations at both item- and category-level among successfully remembered items. Furthermore, while previous studies have demonstrated that TMR during SWS can facilitate memory consolidation72,73,74, the present findings complement these studies by showing that TMR does not appear to promote representational transformation during this consolidation process. These findings together may support the hypothesis that memory reactivation during SWS stabilizes but does not transform memory9. Moreover, recent research suggested that TMR during slow-wave sleep had no significant effect on sleep architecture, such as the length of SWS and REM sleep75. However, our study cannot rule out the possibility that TMR that occurs during SWS may trigger memory representations into labile states, which allows memory representations to be transformed during subsequent REM sleep55,76. Future studies should further examine the interactive functional roles of REM and SWS in memory representational transformation during spontaneous overnight sleep.

Regarding the temporal dynamics of memory representations, previous research suggests that visual perceptual details typically emerge earlier than conceptual features during encoding, whereas the reverse pattern tends to appear during retrieval77. Consistent with these findings, our results revealed that during pre-sleep learning, item-specific representations emerged around 500 ms post-stimulus onset, aligning with the early perceptual encoding of visual stimuli78. In contrast, category-level representations appeared later—around 2500–4800 ms post-stimulus onset—during the earlier encoding trials. Interestingly, with repeated exposure, category-level representations were increasingly detected in earlier time windows (~0–1100 ms), indicating that abstract representations may be reactivated more efficiently with repeated learning79,80. Again, in line with prior work77, during the post-sleep retrieval session, category-level representations emerged earlier (~1700–4800 ms post cue onset) than the reduction in item-level representations (~2100–4800 ms). Furthermore, our supplementary analyses showed that the association between the REM/SWS ratio and category-level representational change was strongest at later time windows during the first two retrieval repetitions, but shifted to earlier time windows in subsequent repetitions. This temporal shift suggests that REM/SWS balance may initially support the retrieval of less accessible representations, which become increasingly accessible through repeated retrieval.

Overall, our study demonstrates overnight memory transformation: while memory representations contained both item- and category-level information during pre-sleep learning, only category-level representations were dominant post-sleep. More importantly, REM sleep and SWS played differential roles in the representational transformation: a greater amount of REM sleep, relative to SWS, was associated with greater memory representational transformation. These findings advance our understanding of the interactive functional roles of human SWS and REM sleep in memory consolidation and transformation.

Methods

Participants

Thirty-five healthy, right-handed participants were included in the study (26 females, mean age ± SD: 22 ± 2.79 years). Two additional participants who exhibited significant body movements during pre-sleep learning/post-sleep mental retrieval sessions were excluded during initial data visual inspection and screening. Behavioral and wakefulness EEG data analyses were performed on all 35 included participants. However, for the sleep EEG data analysis, one participant was excluded due to disconnected EEG recordings in the middle of overnight sleep, resulting in a final sample size of 34 participants in sleep analyses. All participants were native Chinese speakers. Prior to participation, all participants underwent pre-screening for sleep quality using the Pittsburgh Sleep Quality Index (PSQI) and the Insomnia Severity Index (ISI), ensuring overall good sleep quality. They had not taken any sleep-aid medicines in the month prior to the experiment. None of the participants had been diagnosed with any neurological or psychiatric disorders, and all had normal or corrected-to-normal vision. The study was approved by the Research Ethics Committee of the University of Hong Kong. All participants gave written informed consent prior to participation. All ethical regulations relevant to human research participants were followed. Note that the present study is based on the same dataset as collected in the previous study37, with 28 of the 35 participants overlapping. Participants who were previously excluded due to insufficient TMR trials (n = 7) were included in the present analysis.

Experimental design

The experiment encompassed three primary sessions: (1) a pre-sleep session, including a word–picture associative learning task and pre-sleep memory tests, (2) an overnight sleep session with targeted memory reactivation (TMR) administered during SWS for the initial 3–4 sleeping hours, and (3) a post-sleep session, including a cued-recall test, a post-sleep mental retrieval task with a written report, and a cued-recognition test. All the behavioral tasks were administered using PsychoPy (version: 2020.2.10; https://www.psychopy.org/).

During the pre-sleep word–picture associative learning task, participants were instructed to memorize a total of 96 distinct word–picture pairs. The 96 words were two-character Chinese verbs, while the corresponding pictures were naturalistic images. Each picture fell into one of four categories, namely animals, electronic devices, plants, and transportation tools, with 24 pictures in each category. Each word was randomly paired with a picture for each participant. Each learning trial consisted of three phases: encoding, maintenance, and vividness rating. During encoding, participants were presented with a fixation cross for 0.3 s, followed by a black screen with jittering durations between 0.9 and 1.5 s. Subsequently, a picture was displayed at the center of the screen for 2 s, accompanied by the auditory presentation of the corresponding spoken verb. Participants were explicitly instructed to focus on the picture and memorize the associations between the verbs and the pictures. In the immediately following maintenance period, the presented picture disappeared, and participants were instructed to vividly mentally maintain the picture for a duration of 3 s while hearing the spoken verb again. In the vividness rating phase, participants were required to evaluate the subjective vividness of the mental image they held during the maintenance period on a scale from 1 (not vivid at all) to 4 (very vivid) within 2 s. The entire pre-sleep learning task consisted of three blocks, with each block consisting of 32 distinct verb-picture pairs and each pair repeated three times within a block. The average vividness score was 3.50 ± 0.53 (mean ± SD), indicating strong engagement with vivid item-specific mental imagery. To minimize the potential influence of the recency effect, participants engaged in a ~5-min math task immediately after completing the learning task.

After the distractor math task and a short break (~5 min), half of the pairs (i.e., 48 pairs) were tested via the cued-recall task and the cued-recognition task pre-sleep. In the cued-recall task, each trial started with a 0.3 s fixation, followed by a blank screen (0.9–1.5 s). The spoken verb was played, prompting participants to report whether they “remember” or “forget” the corresponding picture. This stage was self-paced so that participants had enough time to recall. Following the “remember” or “forget” response, participants were asked to report the category of the picture by pressing one of four buttons, with each button representing one of the four categories. In the cued-recognition task, the same half of the pairs were tested. Each recognition trial began with a fixation (0.3 s) and was followed by a blank screen (0.9–1.5 s). Participants were then presented with an auditory cue word and a picture simultaneously. They were then asked to judge whether the image was an exact match to the one originally paired with the auditory cue word by pressing the “Yes” or “No” button. Importantly, to encourage participants to form associative memories between cue words and item-specific pictures—rather than establishing simple associations between cue words and picture categories, or storing cue words and pictures independently, the cued-recognition test was carefully designed to include four types of trials: (1) old trials (the identical word–picture pairs learned before); (2) critical lure trials (cue words paired with lures that closely resemble the target pictures); (3) recombined old trials (cue words paired with pictures from other learned word–picture pairs); (4) recombined lure trials (cue words paired with lures resembling pictures from other learned pairs).

After the pre-sleep test, participants went to sleep from ~12 a.m. to 8 a.m. Targeted memory reactivation cueing was delivered during SWS in the first 3–4 h after participants fell asleep37. All TMR cues were delivered manually by experienced experimenters who monitored EEG signals in real-time. Auditory cues were played only during the identified slow-wave sleep (SWS) stage and stopped immediately when a stage shift or arousal was detected. This conservative approach ensured that cueing occurred only during stable SWS epochs. During the TMR session, only half of the pre-sleep learned pairs were cued during SWS. Memory performance was balanced between TMR cued versus uncued items according to their pre-sleep cued recall performance. Moreover, TMR cues and uncued pairs were balanced in picture categories, with 12 pictures from each of the four categories.

Approximately 30 min after awakening the next morning, participants were tested on all 96 pairs. The post-sleep test included the same cued recall and cued recognition tasks, with an additional mental retrieval task in between. The mental retrieval task was particularly designed to examine the neural representations. Specifically, participants were asked to keep their eyes closed throughout the entire testing block, during which they were asked to mentally retrieve the associated picture as vividly as possible while hearing the auditory verbs, without any explicit behavioral responses. This design aimed to minimize movement-related noise during the EEG recording of mental retrieval. These auditory verbs were randomly played via the speaker with an interstimulus interval (ISI) of 5 ± 0.2 s, comparable to the trial length during pre-sleep learning. Each auditory cue was repeated three times. After the completion of the mental retrieval task, participants received a printed form containing all the cue verbs presented during the mental retrieval task. They were then asked to write down the specific content they retrieved during the mental retrieval task for each cue verb. We used the accuracy of the written report as a rule-of-thumb measure of post-sleep memory performance in recalling specific items. A word–picture pair was considered as “remember” if the written response identified the correct object in the target image, and “forget” if the participant left the item blank or reported an incorrect object.

EEG recording and preprocessing

EEG data were recorded using the amplifier from the eego system (ANT neuro, Netherlands, https://www.ant-neuro.com). Data were sampled at 500 Hz using 64-channel WaveGuard EEG caps, among which 61 channels were mounted in the international 10–20 system, while two channels were placed on the left and right mastoids, and one channel was positioned above the left eye for EOG measurements. During the sleep EEG recordings, two additional channels were placed on both sides of the chin to measure the electromyogram (EMG) using a bipolar reference configuration. EEG data were continuously collected from pre-sleep learning to the next morning, including the overnight sleep. Prior to EEG recordings, impedance levels for all channels were maintained below 20 kΩ. On the morning following sleep, the EEG recording was paused for ~30 min to allow participants to refresh before the post-sleep tests. Before resuming the post-sleep EEG recording, we checked impedance levels for all channels and reapplied conductive gel as needed to ensure impedance remained below 20 kΩ—matching the standards used during the pre-sleep encoding session. These measures were taken to ensure comparable signal quality across pre- and post-sleep sessions. During online EEG recordings, the default reference channel (CPz) was used. Offline preprocessing of the EEG data was conducted using the EEGLAB and Fieldtrip (https://www.fieldtriptoolbox.org/) toolboxes, as well as custom MATLAB code.

Specifically, EEG data were first notch filtered at 50 ± 2 Hz, and then bandpass filtered between 0.5 and 40 Hz. The continuous EEG data during the pre-sleep learning task were segmented into epochs spanning from 3000 ms before until 8000 ms after stimulus onset. This long epoch was used to eliminate the edge effect in the subsequent time–frequency analysis. Our main time window of interest for the pre-sleep learning data was from 0 to 5000 ms relative to the stimulus onset. Similarly, for the post-sleep mental retrieval data, the continuous EEG data were segmented into epochs spanning from 3000 ms before until 8000 ms after the auditory word onset, with the time window of interest from 0 to 5000 ms post auditory word onset. Epochs were visually inspected, and those affected by muscle movements were excluded from further analysis. Eye blinks and movements were corrected using independent component analysis. Any identified bad channels were interpolated using spherical interpolation in EEGLAB. Subsequently, the EEG data were re-referenced to the average of the artifact-free data across all channels. For both the pre-sleep learning and the post-sleep retrieval data, EEG epochs were categorized into post-sleep remembered or forgotten trials based on the accuracy of the written report immediately following the post-sleep mental retrieval.
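As an illustration of the epoching and average re-referencing steps described above, the following is a minimal NumPy sketch rather than the EEGLAB/Fieldtrip pipeline actually used. The function names, the toy recording, and the onset positions are hypothetical, and re-referencing is applied before epoching here only for brevity (the operation is linear, so the order does not change the result on clean data).

```python
import numpy as np

def average_rereference(data):
    """Subtract the mean across channels at every sample. data: (channels, samples)."""
    return data - data.mean(axis=0, keepdims=True)

def epoch_continuous(data, onsets, sfreq=500, tmin=-3.0, tmax=8.0):
    """Cut (channels, samples) data into epochs of [tmin, tmax) s around each onset sample."""
    n_before = int(round(-tmin * sfreq))
    n_after = int(round(tmax * sfreq))
    # Keep only onsets whose full epoch fits inside the recording.
    keep = [o for o in onsets if o - n_before >= 0 and o + n_after <= data.shape[1]]
    return np.stack([data[:, o - n_before:o + n_after] for o in keep])

# Toy recording (hypothetical): 61 channels, 60 s at 500 Hz, three stimulus onsets.
rng = np.random.default_rng(0)
continuous = rng.standard_normal((61, 60 * 500))
onsets = [5 * 500, 20 * 500, 40 * 500]
epochs = epoch_continuous(average_rereference(continuous), onsets)
print(epochs.shape)  # (epochs, channels, samples)
```

The long epoch (11 s) leaves padding on both sides of the 0 to 5 s window of interest, so that wavelet edge artifacts fall outside the analyzed interval.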

Sleep scoring

Sleep scoring was conducted on non-overlapping 30-s epochs using Yet Another Spindle Algorithm (YASA), an open-source, machine learning-based toolbox known for its high performance in sleep analysis43. Prior to sleep scoring, bad channels in the EEG data were marked and interpolated. To align with the recommendations of the YASA toolbox, EEG data were re-referenced to FPz. For sleep scoring, the C4 channel, as well as the EOG and EMG channels, were used as inputs to the YASA algorithm. The scoring results were then double-checked and corrected by an experienced sleep researcher to ensure accuracy and reliability.

Time–frequency analysis

For both the pre-sleep learning and post-sleep mental retrieval stages, the epoched EEG data underwent time-frequency analysis using complex Morlet wavelets (six cycles). The frequency range of interest was from 2 to 40 Hz, with a step size of 1 Hz. The time window of interest was from −1 to 5 s relative to the stimulus onset. To obtain the spectral power, the magnitudes of the complex wavelet transform were squared. To account for the 1/f power-law effect, we applied a baseline normalization procedure. Specifically, following the well-established procedure81, for each trial, channel, and frequency bin, we subtracted the mean power during a pre-stimulus baseline window (−0.7 to −0.4 s relative to stimulus onset) and then divided by that same baseline mean. This approach is widely used to attenuate both the 1/f component and task-unrelated baseline activity. The resulting normalized spectral power data were subsequently downsampled to 100 Hz and re-segmented into 5-s epochs, specifically [0–5 s] relative to the stimulus onset. The normalized spectral power within this broad frequency range [2–40 Hz] and epoch duration were used as features for subsequent representational similarity analyses.
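The baseline normalization described above amounts to a relative change, (power - baseline mean) / baseline mean, computed separately for each trial, channel, and frequency bin. The sketch below illustrates this on random toy data; the array layout and the function name are assumptions made for illustration, not the authors' MATLAB code.

```python
import numpy as np

def baseline_normalize(power, times, base_start=-0.7, base_end=-0.4):
    """Relative-change normalization of spectral power.

    power: (trials, channels, freqs, times), strictly positive values.
    times: 1-D array of time points in seconds relative to stimulus onset.
    """
    mask = (times >= base_start) & (times <= base_end)
    # One baseline mean per trial, channel, and frequency bin.
    base = power[..., mask].mean(axis=-1, keepdims=True)
    return (power - base) / base

# Toy power values (hypothetical): 4 trials, 3 channels, 5 frequencies, -1 to 5 s at 100 Hz.
rng = np.random.default_rng(0)
times = np.arange(-1.0, 5.0, 0.01)
power = rng.gamma(shape=2.0, scale=1.0, size=(4, 3, 5, times.size))
normalized = baseline_normalize(power, times)
```

By construction, the normalized power averages to zero within the baseline window, so post-stimulus values can be read as proportional change from baseline.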

Oscillatory power estimation during SWS and REM sleep

For sleep EEG data, we first epoched the continuous sleep EEG into 30-s epochs. To separate the oscillatory power from the 1/f power-law effect (i.e., fractal component), we employed irregular-resampling auto-spectral analysis (IRASA)82. Specifically, for each raw sleep epoch, IRASA first divided the data into 10 equally sized, partially overlapping segments, each covering 90% of the epoch. It then computed the power spectral density (PSD) of each segment of the raw data using the fast Fourier transform (FFT) with a Hanning window. Afterward, it irregularly resampled each segment by factors of h (ranging from 1.1 to 1.9 in increments of 0.05) and 1/h. It used cubic spline interpolation for irregular upsampling, and anti-aliasing low-pass filtering followed by cubic spline interpolation for irregular downsampling. The PSD of the resampled data was then computed using the same FFT procedure. For each segment and each h value, the geometric mean of the auto-power spectra of the upsampled and downsampled signals was calculated. The median of these power spectra across all h values was then taken at each frequency to estimate the power spectrum of the 1/f power-law effect (i.e., fractal component). We then averaged the estimated power spectrum of the fractal component and the original signal’s power spectrum across all segments for each sleep EEG epoch. The oscillatory power for individual epochs was then estimated by subtracting the average power spectrum of the fractal component from the PSD of the raw data.
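The IRASA logic described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation used in the study: linear interpolation (`np.interp`) stands in for cubic splines, no anti-aliasing filter is applied before downsampling, and the function names and the toy signal (pink noise plus a 6 Hz sinusoid, sampled at 100 Hz) are hypothetical.

```python
import numpy as np

def hann_psd(x, fs, nfft):
    """Periodogram of a Hann-windowed segment on a fixed nfft-point frequency grid."""
    w = np.hanning(x.size)
    spec = np.fft.rfft((x - x.mean()) * w, n=nfft)
    return np.abs(spec) ** 2 / (fs * np.sum(w ** 2))

def resample_linear(x, factor):
    """Resample by `factor` with linear interpolation (stand-in for cubic splines)."""
    n_new = int(np.floor(x.size * factor))
    return np.interp(np.arange(n_new) / factor, np.arange(x.size), x)

def irasa(epoch, fs, hset=None, n_seg=10, coverage=0.9):
    """Return freqs, mean raw PSD, mean fractal PSD, and their difference (oscillatory part)."""
    hset = np.linspace(1.1, 1.9, 17) if hset is None else np.asarray(hset)
    seg_len = int(coverage * epoch.size)
    starts = np.linspace(0, epoch.size - seg_len, n_seg).astype(int)
    # Zero-pad to a common length so that all spectra share one frequency grid.
    nfft = int(2 ** np.ceil(np.log2(seg_len * hset.max())))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    psd_orig = np.zeros(freqs.size)
    psd_frac = np.zeros(freqs.size)
    for s in starts:
        seg = epoch[s:s + seg_len]
        psd_orig += hann_psd(seg, fs, nfft)
        per_h = []
        for h in hset:
            up = hann_psd(resample_linear(seg, h), fs, nfft)
            down = hann_psd(resample_linear(seg, 1.0 / h), fs, nfft)
            per_h.append(np.sqrt(up * down))       # geometric mean for this h
        psd_frac += np.median(per_h, axis=0)       # median across h values
    psd_orig /= n_seg
    psd_frac /= n_seg
    return freqs, psd_orig, psd_frac, psd_orig - psd_frac

# Toy 30-s epoch (hypothetical): unit-variance pink noise plus a 6 Hz oscillation.
rng = np.random.default_rng(0)
fs, n = 100, 3000
spec = np.fft.rfft(rng.standard_normal(n))
f = np.fft.rfftfreq(n, d=1.0 / fs)
spec[1:] /= np.sqrt(f[1:])
spec[0] = 0.0
pink = np.fft.irfft(spec, n=n)
pink /= pink.std()
epoch = pink + 2.0 * np.sin(2 * np.pi * 6.0 * np.arange(n) / fs)
freqs, psd_orig, psd_frac, osc = irasa(epoch, fs)
```

Resampling by h and 1/h shifts narrow-band peaks to different frequencies for each h while leaving a power-law spectrum unchanged in shape, which is why the median across h values isolates the fractal component.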

Oscillatory power between 1 and 40 Hz for SWS and for REM sleep was obtained by averaging the oscillatory component across epochs labeled as ‘N3’ (SWS) and ‘REM’, respectively. Sleep oscillations were defined as follows: SO-related band (1–1.25 Hz), delta band (1–3 Hz), theta band (4–7 Hz), sigma band (11–16 Hz), and beta band (15–25 Hz). For REM sleep, frontal theta and beta band power were calculated by averaging the oscillatory power at the F3 and F4 channels45,83,84. For SWS, SO-related and delta band power were calculated by averaging the oscillatory power at the Fz channel, and sigma band power was calculated by averaging the oscillatory power at the Cz channel85,86.
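The band and channel averaging described above can be illustrated as follows. This is a hedged sketch: the array layout, the inclusive band edges, averaging (rather than summing) within a band, and the toy spectrum are all assumptions introduced for illustration.

```python
import numpy as np

# Band limits in Hz as defined in the text; edges are treated as inclusive (assumption).
BANDS = {"SO": (1.0, 1.25), "delta": (1.0, 3.0), "theta": (4.0, 7.0),
         "sigma": (11.0, 16.0), "beta": (15.0, 25.0)}

def band_power(freqs, osc_psd, band):
    """Mean oscillatory power within a band; the last axis of osc_psd is frequency."""
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return osc_psd[..., mask].mean(axis=-1)

# Toy data (hypothetical): 6 scored epochs, 4 channels, power equal to frequency.
freqs = np.arange(1.0, 40.25, 0.25)
channels = ["F3", "F4", "Fz", "Cz"]
stages = np.array(["N2", "N3", "N3", "REM", "REM", "REM"])
osc = np.tile(freqs, (stages.size, len(channels), 1))

rem_mean = osc[stages == "REM"].mean(axis=0)   # (channels, freqs)
n3_mean = osc[stages == "N3"].mean(axis=0)

frontal = [channels.index("F3"), channels.index("F4")]
rem_theta = band_power(freqs, rem_mean[frontal], BANDS["theta"]).mean()
sws_so = band_power(freqs, n3_mean[channels.index("Fz")], BANDS["SO"])
sws_sigma = band_power(freqs, n3_mean[channels.index("Cz")], BANDS["sigma"])
```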

Representational similarity analysis (RSA)

To analyze the neural representations over time, the RSA was performed between the artifact-free trials by correlating the spectral power across frequencies (i.e., 2–40 Hz) and across all scalp channels in sliding time windows. For both the pre-sleep learning and post-sleep retrieval sessions, the length of sliding time windows was 500 ms, with an incremental step size of 100 ms. To increase the signal-to-noise ratio, the spectral power was averaged across time points within each time window, as in previous studies39. This resulted in a set of features consisting of 39 (frequency) by 61 (channel) values for each time window. Then, for each time window, we calculated the similarity between the vectorized features of each pair of trials using Spearman’s correlations. All the correlation values were Fisher Z-transformed before further statistical analysis.
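The windowing and similarity computation described above can be sketched as follows. This is an illustrative NumPy version under assumptions: the function names are hypothetical, ranks are computed without tie handling (reasonable for continuous power values), correlations are clipped before the Fisher transform to keep self-correlations finite, and the toy data use fewer channels and frequencies than the 61 by 39 features in the study.

```python
import numpy as np

def window_features(power, sfreq=100, win_s=0.5, step_s=0.1):
    """Average power within sliding windows and vectorize channels x frequencies.

    power: (trials, channels, freqs, times) -> (trials, windows, channels * freqs)
    """
    win = int(round(win_s * sfreq))
    step = int(round(step_s * sfreq))
    starts = range(0, power.shape[-1] - win + 1, step)
    feats = [power[..., s:s + win].mean(axis=-1).reshape(power.shape[0], -1)
             for s in starts]
    return np.stack(feats, axis=1)

def rank_rows(x):
    """Ordinal ranks along the last axis (assumes no ties)."""
    return np.argsort(np.argsort(x, axis=-1), axis=-1).astype(float)

def spearman_fisher_z(feats_a, feats_b):
    """Spearman correlation between every trial in a and b, Fisher z-transformed."""
    ra, rb = rank_rows(feats_a), rank_rows(feats_b)
    ra -= ra.mean(axis=-1, keepdims=True)
    rb -= rb.mean(axis=-1, keepdims=True)
    ra /= np.linalg.norm(ra, axis=-1, keepdims=True)
    rb /= np.linalg.norm(rb, axis=-1, keepdims=True)
    rho = ra @ rb.T
    return np.arctanh(np.clip(rho, -0.999999, 0.999999))

# Toy data (hypothetical): 6 trials, 4 channels, 5 frequencies, 5 s at 100 Hz.
rng = np.random.default_rng(0)
power = rng.standard_normal((6, 4, 5, 500))
feats = window_features(power)                      # (trials, windows, features)
z = spearman_fisher_z(feats[:, 0], feats[:, 0])     # trial-by-trial similarity, first window
```

Correlating the features of window i in one trial with those of window j in another, for all i and j, yields the two-dimensional time-by-time similarity maps; the sketch shows only one matched window.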

We categorized representational similarity values into three types: within-item (WI) similarity, within-category (WC) similarity, and between-category (BC) similarity. These categories were based on the corresponding pictures used in the trial pairs for calculating similarity values. WI similarity refers to the similarity between two trials that share the same pictures. WC similarity refers to the similarity between two trials that involve different pictures from the same category. BC similarity refers to the similarity between two trials that involve pictures from different categories. Comparing WI similarity to WC similarity enables us to examine the item-level neural representations, while comparing WC similarity to BC similarity enables us to examine the category-level representations19,20.
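The three pair types defined above can be derived directly from the picture and category labels of the two trials in each pair. The sketch below is a minimal illustration; the function name and the toy labels are hypothetical.

```python
import numpy as np

def pair_types(items_a, cats_a, items_b, cats_b):
    """Label every trial pair as WI (same picture), WC (same category), or BC."""
    items_a, cats_a = np.asarray(items_a), np.asarray(cats_a)
    items_b, cats_b = np.asarray(items_b), np.asarray(cats_b)
    same_item = items_a[:, None] == items_b[None, :]
    same_cat = cats_a[:, None] == cats_b[None, :]
    types = np.full(same_item.shape, "BC")
    types[same_cat & ~same_item] = "WC"
    types[same_item] = "WI"
    return types

# Toy session (hypothetical): one picture shown twice plus two other pictures.
items = ["cat", "cat", "dog", "tulip"]
cats = ["animal", "animal", "animal", "plant"]
types = pair_types(items, cats, items, cats)

# Within a session, keep each pair once and drop pairs of a trial with itself.
valid = np.triu(np.ones(types.shape, dtype=bool), k=1)
counts = {t: int(np.sum((types == t) & valid)) for t in ("WI", "WC", "BC")}
```

For cross-session RSA the two label lists come from different sessions, so every cell of the matrix is a valid pair and no triangular mask is needed.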

The representational similarity was computed either across repetitions within each task session (i.e., within-session RSA) or between two different sessions (i.e., cross-session RSA). Within the pre-sleep learning session, RSA was conducted on distinct learning trials, yielding the Pre–Pre similarity (see Fig. 1). Within the post-sleep mental retrieval session, the RSA was performed between different post-sleep mental retrieval trials, resulting in the Post–Post similarity. Cross-session RSA was conducted between trial pairs, with one trial originating from the pre-sleep learning session and the other from the post-sleep retrieval session. This analysis yielded the Pre–Post similarity. Given that both within-session RSA and cross-session RSA allowed us to compute WI, WC, and BC similarities, we could examine item-level and category-level representations for Pre–Pre similarity, Post–Post similarity and Pre–Post similarity.

Memory representational transformation over sleep was quantified by comparing within-session pre-sleep versus post-sleep similarities, as well as by comparing the within-session versus cross-session representational similarity, following established methods39,42,87. To avoid the confound of the temporal proximity effect—where neural similarity is artificially inflated between time windows that are temporally close—we did not directly compare raw within-session and cross-session similarity values for within-item (WI) or within-category (WC) pairs. Instead, we compared within-session and cross-session representations at the item-level similarity (i.e., WI minus WC), and category-level similarity (i.e., WC minus BC). Given that WI, WC, and BC similarities were all calculated from trial pairs with comparable temporal distances in both within-session and cross-session RSA, the WI minus WC for item-level and the WC minus BC for category-level representations eliminate temporal proximity effects, both within and across sessions.
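The reasoning above, that a similarity offset shared by all pair types cancels in the WI-minus-WC and WC-minus-BC contrasts, can be checked with a small numerical example. The similarity values, the pair-type matrix, and the function name below are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def representational_contrasts(sim, types, valid=None):
    """Return (item-level, category-level) contrasts: WI - WC and WC - BC."""
    if valid is None:
        valid = np.ones(sim.shape, dtype=bool)
    means = {t: sim[(types == t) & valid].mean() for t in ("WI", "WC", "BC")}
    return means["WI"] - means["WC"], means["WC"] - means["BC"]

# Hypothetical pair-type matrix for four trials and made-up similarity levels.
types = np.array([["WI", "WI", "WC", "BC"],
                  ["WI", "WI", "WC", "BC"],
                  ["WC", "WC", "WI", "BC"],
                  ["BC", "BC", "BC", "WI"]])
levels = {"WI": 0.30, "WC": 0.20, "BC": 0.05}
sim = np.vectorize(levels.get)(types)
valid = np.triu(np.ones(types.shape, dtype=bool), k=1)

item_level, category_level = representational_contrasts(sim, types, valid)

# A constant inflation added to every pair (e.g. temporal proximity) leaves both contrasts unchanged.
item_shifted, category_shifted = representational_contrasts(sim + 0.4, types, valid)
```

The cancellation holds only to the extent that the inflation is the same for WI, WC, and BC pairs, which is why the pairs are required to have comparable temporal distances.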

Correlation analysis between memory representational strength changes and REM/SWS (or SWS*REM)

To quantify representational strength change over sleep, we first computed the mean item-level and category-level representational strength in the pre-sleep learning session by averaging all values from the two-dimensional Pre–Pre WI-minus-WC (i.e., item-level) and WC-minus-BC (i.e., category-level) matrices, respectively. We then subtracted the mean strength of pre-sleep item-level or category-level representations from each cell of the corresponding two-dimensional similarity matrix for post-sleep item-level or category-level representations. This resulted in two-dimensional matrices, capturing time-resolved item-level and category-level representational strength change, respectively. Meanwhile, we obtained the REM/SWS ratio by dividing the REM% (i.e., the percentage of REM sleep in total sleep time) by the SWS% (i.e., the percentage of SWS in total sleep time). The SWS*REM (%*%) index was obtained by multiplying the SWS% by the REM%. We then computed Spearman’s correlations between the REM/SWS ratio (or SWS*REM) and each cell of the two-dimensional representational change matrices across participants. These analyses resulted in correlation coefficient matrices that mapped the temporal windows during which item- and category-level changes were significantly associated with the REM/SWS ratio (or SWS*REM). These correlation coefficient matrices were then Fisher Z-transformed before further analysis.
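The sleep indices and the cell-wise correlation described above can be sketched as follows. This is an illustration under assumptions: the stage labels ("W", "N1", "N2", "N3", "REM"), the function names, and the toy data are hypothetical, total sleep time is taken as all non-wake epochs, and ranks are computed without tie handling.

```python
import numpy as np

def stage_percentages(hypnogram):
    """REM% and SWS% (N3) relative to total sleep time, from 30-s epoch labels."""
    hyp = np.asarray(hypnogram)
    tst = np.sum(hyp != "W")
    return 100.0 * np.sum(hyp == "REM") / tst, 100.0 * np.sum(hyp == "N3") / tst

def rank_axis0(x):
    """Ordinal ranks across participants (axis 0); assumes no ties."""
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)

def spearman_map(index, change):
    """Fisher z-transformed Spearman correlation of a sleep index with each matrix cell.

    index: (participants,); change: (participants, t1, t2) -> (t1, t2)
    """
    ri, rc = rank_axis0(index), rank_axis0(change)
    ri = ri - ri.mean()
    rc = rc - rc.mean(axis=0, keepdims=True)
    rho = np.tensordot(ri, rc, axes=(0, 0)) / np.sqrt((ri ** 2).sum() * (rc ** 2).sum(axis=0))
    return np.arctanh(np.clip(rho, -0.999999, 0.999999))

# One toy hypnogram: 8 sleep epochs, of which 3 are REM and 2 are N3.
rem_pct, sws_pct = stage_percentages(["W", "N1", "N2", "N3", "N3", "REM", "N2", "REM", "REM", "W"])
ratio_one, product_one = rem_pct / sws_pct, sws_pct * rem_pct

# Toy group data (hypothetical): 12 participants, 5 x 5 representational change matrices.
rng = np.random.default_rng(0)
ratios = rng.uniform(15, 25, size=12) / rng.uniform(10, 20, size=12)
change = rng.standard_normal((12, 5, 5))
change[:, 0, 0] = 2.0 * ratios        # one cell made perfectly monotonic with the ratio
z_map = spearman_map(ratios, change)
```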

Statistics and reproducibility

Behavioral comparisons between TMR-cued and uncued items, as well as between pre-sleep tested and untested items, were conducted using paired-sample t-tests implemented in MATLAB. For RSA, Spearman’s rank correlation was used to compute the similarity between neural patterns. To statistically compare conditions (e.g., WI vs. WC or WC vs. BC) across consecutive time windows, we employed non-parametric cluster-based permutation tests88. Specifically, we first calculated the differences between conditions (e.g., WI vs. WC or WC vs. BC) within individual time windows. To assess the significance of the differences between conditions, we generated a null distribution by randomly permuting condition labels 1000 times for individual participants and recalculating the condition differences for each permutation. The empirical differences were then compared to this null distribution to identify significant time windows, which showed that empirical differences exceeded the null distribution threshold (α = 0.05). Adjacent significant time windows (p < 0.05) were grouped into clusters. For each permutation, we also identified the maximum cluster size and compared the size of empirical clusters to the surrogate clusters from the 1000 permutations. If no significant cluster was observed in a permutation, a value of 0 was assigned. Non-parametric cluster-level significance was determined by the proportion of cluster sizes in the null distribution that exceeded the empirical cluster size.
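The cluster-based permutation procedure described above can be illustrated with a simplified one-dimensional sketch. The assumptions here are loud ones: condition labels are permuted within participants by sign-flipping the paired differences, the test is one-sided, cluster size is the cluster statistic, and ties count as exceeding. The function names and toy data are hypothetical, and the actual analyses operated on two-dimensional time-by-time maps.

```python
import numpy as np

def clusters(mask):
    """Return (start, stop) index pairs for runs of True; stop is exclusive."""
    padded = np.concatenate(([False], mask, [False])).astype(int)
    d = np.diff(padded)
    return [(int(a), int(b)) for a, b in zip(np.flatnonzero(d == 1), np.flatnonzero(d == -1))]

def cluster_permutation_test(diff, n_perm=1000, alpha=0.05, seed=0):
    """diff: (participants, windows) paired condition differences, e.g. WI - WC.

    Returns a list of (start, stop, p_value) for each observed cluster.
    """
    rng = np.random.default_rng(seed)
    n_sub = diff.shape[0]
    observed = diff.mean(axis=0)
    # Swapping condition labels within a participant flips the sign of the difference.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, n_sub))
    null = signs @ diff / n_sub                        # (n_perm, windows)
    thresh = np.quantile(null, 1 - alpha, axis=0)      # window-wise null threshold
    null_max = np.array([max((b - a for a, b in clusters(row > thresh)), default=0)
                         for row in null])             # 0 when a permutation has no cluster
    return [(a, b, float(np.mean(null_max >= (b - a))))
            for a, b in clusters(observed > thresh)]

# Toy data (hypothetical): 30 participants, 40 windows, an effect in windows 10 to 19.
rng = np.random.default_rng(1)
diff = rng.standard_normal((30, 40))
diff[:, 10:20] += 1.5
results = cluster_permutation_test(diff)
```

Using the maximum cluster size per permutation as the null statistic controls the family-wise error rate across windows without requiring a separate correction per window.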

For correlations between sleep metrics and RSA measures across consecutive time windows, we also applied non-parametric cluster-based permutation tests. Briefly, the empirical cluster-level statistics were obtained by summing the transformed correlation coefficients across adjacent time windows with significant correlation (i.e., p < 0.05). The null distribution of cluster-level statistics was obtained by shuffling the order of participants for REM/SWS (or SWS*REM) 1000 times while keeping the order of participants for the representational transformation index matrix unchanged. For each shuffling, the same correlation analysis was conducted, and the maximum cluster-level statistic was identified. The non-parametric significance of a cluster was determined by calculating the proportion of cluster-level statistics in the null distribution exceeding the empirical cluster-level statistic. After identifying a significant cluster, post-hoc analyses examining the relationship between sleep architecture and representational strength changes were conducted using robust linear regression. Multiple comparisons were corrected using the false discovery rate (FDR) method.
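For the FDR correction mentioned above, the text does not state which procedure was used; the sketch below assumes the Benjamini-Hochberg step-up procedure, which is the most common choice. The function name and the example p-values are hypothetical.

```python
import numpy as np

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg adjusted p-values and rejection decisions (assumed procedure)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downward.
    adj_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(adj_sorted, 0.0, 1.0)
    return adjusted <= q, adjusted

# Hypothetical p-values from four post-hoc regressions.
reject, adjusted = fdr_bh([0.01, 0.04, 0.03, 0.005])
```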

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.