Introduction

There are many anecdotes of artists and scientists having a sudden breakthrough on a problem or creative work they have temporarily put aside. A large body of studies has suggested that taking breaks from creative problem solving can facilitate significant improvements in subsequent performance compared to working continuously1. These breaks, during which active work on a problem is temporarily suspended to allow for either rest or engagement in an unrelated task, are often referred to as incubation periods. While incubation periods have been shown to benefit work on various types of creativity problems, including “insight” problems, which require a novel perspective to solve and typically have a single correct answer, incubation has particularly strong benefits for work on divergent thinking problems, or open-ended problems that allow for many solutions1.

Various psychological processes may account for these incubation-related performance improvements. For instance, these benefits could be caused by cognitive enhancements from a reduction in mental fatigue or by the continued opportunity to deliberately think about the problem during the break2 – known as the “conscious-work hypothesis”1. If the conscious-work hypothesis is correct, then an incubation period during which people are allowed to rest should yield the greatest subsequent performance benefit, as rest would be expected to maximize reductions in mental fatigue and minimize distraction allowing for continued problem solving. Alternatively, some researchers have proposed that incubation benefits are due to gradual, undirected processes that occur during the incubation of the problem representation – known as the “unconscious-work hypothesis”1,3,4. These inadvertent mental processes may include problem representation restructuring5, weakening activation of irrelevant concepts4, or spreading activation to associated relevant concepts6,7 which may occur when individuals allow their minds to wander in an undirected manner8. If the unconscious-work hypothesis is correct, then an incubation period during which people shift their attention away from the problem should yield the greatest subsequent performance benefit, as this distraction may facilitate the deactivation or forgetting of misleading knowledge and unhelpful self-imposed constraints, as well as the spontaneous activation of relevant knowledge through mind wandering, or task-unrelated thought.

Several studies have provided evidence for the primacy of conscious thought in accounting for incubation benefits (i.e., conscious-work hypothesis). Browne and Cruse2 found that, while taking a 5-min break from working on a geometric insight problem, participants who were told to relax during the incubation period were more successful after the break and reported thinking about the problem more often compared to participants who were given a distractor task during the break. Medd and Houtz9 instructed fourth grade students to engage in creative story writing separated by a 10-min incubation period. Students who were instructed to actively think about their stories during the break wrote stories that were judged as more creative than those written by students who were not given the instructions. These studies suggest that incubation benefits may, at least in part, be due to continued deliberate problem solving during the break.

On the other hand, a number of studies have supported the notion that incubation benefits are not primarily the result of deliberate thought during incubation and may actually be weakened by explicit problem-solving attempts (i.e., unconscious-work hypothesis). Schooler et al.10 found that verbalizing conscious problem-solving strategies during a break impaired participants’ performance on insight problems compared to participants who did not verbalize their strategies, potentially suggesting an advantage of implicit processes for problem solving. Dijksterhuis11 found that distraction during short incubation periods led to better subsequent decision making compared to an incubation period that allowed for conscious problem solving. Similarly, the meta-analysis conducted by Sio and Ormerod1 found that incubation periods involving undemanding tasks yielded greater post-incubation benefits to linguistic insight problems than mere rest. Presumably, engaging in an unrelated task during incubation interferes with explicit problem solving compared to rest, thus these findings support the unconscious-work hypothesis. This meta-analysis also found that an incubation task containing misleading cues, which may facilitate the forgetting of irrelevant problem information, yielded a similar post-incubation benefit, suggesting a potential advantage of undirected cognitive processes during incubation.

Additional studies suggest that incubation benefits could be driven by undirected associative processing. For instance, following exposure to insight problems, REM-stage sleep has been shown to benefit subsequent performance compared to rest or NREM-stage sleep12,13,14, likely due to REM-related associative network activity15,16,17. It has also been suggested that spontaneous task-unrelated thought, or mind wandering, during wakefulness may contribute to incubation benefits. Mind wandering has been linked to insight18 and “aha” experiences19, though evidence is mixed as to whether mind wandering frequency predicts insight problem-solving performance and creativity19,20,21,22. Baird et al.23 examined the effects of an incubation period on divergent thinking problems and found that engaging in an undemanding task during incubation led to greater creative improvement than a demanding task. The undemanding task also induced more mind wandering without increasing conscious thought about the problems, supporting the unconscious-work hypothesis and suggesting a beneficial role of mind wandering in creative incubation. However, these results did not replicate24 and similar studies failed to find a relationship between incubation mind wandering and creative improvement25,26. Therefore, the role of mind wandering in incubation-related creative benefits remains unclear.

Purpose of the present study

The present preregistered (https://osf.io/cxm89) study investigated whether mind wandering is one mechanism contributing to incubation-related effects on creativity. Because commonly used tests of divergent thinking do not fully reflect real-world creative behavior – particularly the synthesis and integration of ideas into creative products – the present study employed a creative writing task in which participants wrote fictional short stories based on open-ended prompts. During a break between writing sessions, participants engaged in one of several unrelated tasks each designed to facilitate a specific degree of mind wandering. Research has shown that mind wandering can be acutely dampened both by tasks requiring substantial cognitive resources such as working memory tasks23,24,27,28,29 and by tasks that promote mindfulness such as mindful breathing30,31. Thus, the present study utilized working memory and mindfulness tasks to test the possibility that incubation periods that promote mind wandering (a low-cognitive load 0-back task) enhance subsequent creative thinking and incubation periods that discourage mind wandering (a high-cognitive load 2-back task and mindfulness meditation) diminish subsequent creative thinking. Alternatively, while evidence is mixed8,32,33,34,35,36,37,38,39,40, it is possible that mindfulness in particular could improve creativity by a mechanism unrelated to mind wandering, such as diminishing rigid or habitual thought patterns41,42, allowing access to more distant connections or the possibility of problem representation restructuring. For instance, Kudesia et al.8 found that a mindful incubation period led to a more distant search in a divergent thinking task. Still, the psychological mechanisms through which mindfulness impacts creative incubation have not been thoroughly explored, in part due to the limited use of validated assessments of mental content during incubation. 
The present study used a validated questionnaire43 assessing incubation thought content (e.g., mind wandering) during the break. After the break, some participants wrote another story based on the same prompt while others received a different prompt, in order to distinguish whether incubation effects stem from incubating story information or fostering a general mental state conducive to creative writing. We hypothesized that (H1) participants would show greater creative improvement following breaks designed to promote mind wandering (0-back task) than following breaks designed to discourage mind wandering (2-back task, mindfulness, and no break), and these effects would only be observed in participants who received a repeated prompt after the break.

In addition to elucidating how various incubation tasks affect mind wandering and creativity, the present study investigated whether the cognitive processes that contribute to incubation effects are goal-oriented and intentional or less directed. The relative contribution of goal-directed and undirected thought in incubation effects is unclear2,9,23,44,45, and research has shown that even mind wandering can be either intentional or unintentional46,47. In fact, propensity for intentional and unintentional mind wandering has shown differential effects on creativity48, though evidence is mixed22. In the present study, the contribution of top-down processes on incubation effects was tested by manipulating participants’ expectations of future problem solving during the incubation period. Specifically, some participants were informed about the post-incubation writing task before the incubation period, providing them with an implicit goal, while others were not told in advance. We hypothesized that (H2) participants informed about the second writing task would show increased mind wandering and creative improvement compared to participants who were not told in advance.

We also tested the relationship between mind wandering and creative improvement directly. We predicted that (H3) mind wandering during incubation would correlate with creative improvement in participants who received a repeated prompt post-incubation.

Methods

Hypotheses, design, recruitment and experimental procedures, exclusion criteria, and statistical analyses were preregistered on Open Science Framework (https://osf.io/cxm89).

Participants

Two hundred participants (89 female, 111 male; age M = 37.8, SD = 12.5) were recruited on the Prolific website between September 2022 and January 2023 and compensated $8 for completing the study. Participants were all native English-speaking adults in the United States. The study was approved by the University of Southern California Institutional Review Board and was conducted in accordance with all relevant guidelines and regulations. Informed consent was obtained from all participants before they completed the study.

Materials

During recruitment, participants were informed they would need to complete the study in one approximately 40-min sitting on their desktop computers. The study was conducted on the Qualtrics website, except for the incubation tasks which were run using the PsychoPy49 JavaScript (PsychoJS) library on a website automatically redirected from Qualtrics. Participants were prevented from opening the survey if Qualtrics detected they were using a mobile device.

Demographics and trait questionnaires

Participants were asked to indicate their age, gender, whether they were red-green color-blind, if they had ever practiced mindfulness before, and, if so, how often and when they began this practice. Previous achievement in creative writing was assessed with the Creative Writing component of the Creative Achievement Questionnaire50. Big Five personality traits were assessed with the Ten-Item Personality Inventory51,52. Dispositional mind wandering was assessed with the Mind Wandering Questionnaire53. Dispositional mindfulness was measured with the 15-item Five Facet Mindfulness Questionnaire54,55, with the Observing subscale items omitted as suggested by Gu et al.56. Items within each questionnaire were presented in random order to minimize order effects. These questionnaires were used for descriptive purposes and were not used to determine study inclusion.

Creative writing task

During the pre-incubation writing session, participants were shown an instructions screen informing them that they would spend 10 min writing a fictional short story based on a given prompt. The instructions stated that they could take the story in any direction they liked, provided it was based on the prompt, and that they could use the text box for brainstorming and planning but should include only their finished story by the end of the 10 min. For motivational reasons, participants were informed they would be emailed a copy of their story. When participants clicked to the next screen, they were provided with the writing prompt, a text box for typing their story, and a timer displaying the number of seconds remaining. The page displayed without a “next” button and automatically advanced to the next part of the experiment after 10 min.

The post-incubation writing session also lasted 10 min and followed the same format as the pre-incubation session, except that the instructions screen included a timer and auto-advanced after 1 min to keep the length of the break between the two writing sessions consistent and to prevent a break in the no-break condition. Additionally, participants who were assigned the same prompt were informed that they could use their first story for inspiration, but should start from the beginning rather than continue where they left off, and should not assume that the reader will have seen their earlier story.

Two different prompts were used in this study. Prompt A was created by Zedelius et al.57 and is written below:

Create a character who has suddenly and unexpectedly attained some sort of power. In the wider perception of the world the level of authority may be small or great, but for this person, the change is dramatic. Write about the moment in which your character truly understands the full extent of his or her newfound power for the first time.

Prompt B was found on the website Reddit and is written below:

When you die, you appear in a cinema with a number of other people who look like you. You find out that they are your previous reincarnations, and soon you all begin watching your next life on the big screen.

The two prompts were counterbalanced across all conditions such that each appeared an equal number of times within each unique combination of experimental conditions (e.g., Prompt A and Prompt B were presented as the pre-incubation prompt to an equal number of participants in the “repeated prompt” and “new prompt” conditions, as well as across incubation task and goal conditions). This ensured that any prompt-related differences would not systematically bias our analyses. See Supplementary Table S1 for statistical comparisons between prompts.

Incubation tasks

Participants assigned to receive a break between the two writing sessions completed one of the following tasks during the break.

0-Back task

In the 0-back task, numerical digits were presented one at a time. In 10% of trials, a question mark appeared below the number. Each time a question mark was presented, participants were to indicate whether the current number was even (by pressing the “e” key) or odd (by pressing the “o” key). Each trial lasted 2 s and the interstimulus interval was 1 s. Participants were only allowed to respond while the stimuli were displayed and not during the interstimulus interval. The question mark turned green following correct responses and red for incorrect responses. The task lasted 10 min, for a total of 200 trials. This cognitively undemanding task was intended to facilitate mind wandering, as has been previously shown23,24,29, given that target trials were infrequent and participants did not need to pay attention to the numbers during non-target trials (i.e., no working memory load).

2-Back task

The 2-back task was the same as the 0-back task except that whenever a question mark was presented, participants indicated whether the number two positions back in the sequence was even or odd. The 2-back task used the same number sequence and target trials to keep the stimuli consistent across these two conditions. This cognitively demanding task was intended to reduce participants’ degree of mind wandering, as has been previously shown with a similar task23,24,29, given that performance requires the constant maintenance and updating of information in working memory.
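
The two n-back variants differ only in which digit the parity judgment refers to. The following Python sketch illustrates that logic under stated assumptions: the digit range (1 to 9), the random placement of targets, and the function names make_sequence and correct_key are all hypothetical and are not taken from the study's PsychoJS implementation.

```python
import random

N_TRIALS = 200       # 10 min at 2 s per stimulus plus a 1 s interstimulus interval
TARGET_RATE = 0.10   # a question mark appears on 10% of trials


def make_sequence(n_trials=N_TRIALS, target_rate=TARGET_RATE, seed=0):
    """Return (digits, target_indices) shared by the 0-back and 2-back tasks.

    Assumption: digits are drawn from 1-9 and targets are placed at random,
    never in the first two positions so that a 2-back answer always exists.
    """
    rng = random.Random(seed)
    digits = [rng.randint(1, 9) for _ in range(n_trials)]
    n_targets = round(n_trials * target_rate)
    targets = sorted(rng.sample(range(2, n_trials), n_targets))
    return digits, targets


def correct_key(digits, trial, n_back):
    """Key for a target trial: 'e' if the referenced digit is even, else 'o'."""
    referenced = digits[trial - n_back]
    return "e" if referenced % 2 == 0 else "o"


digits, targets = make_sequence()
first_target = targets[0]
print(correct_key(digits, first_target, n_back=0),
      correct_key(digits, first_target, n_back=2))
```

Because both conditions share one sequence and one set of target trials, only the n_back argument changes between them, which mirrors the stimulus matching described above.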

Mindfulness meditation

In the mindfulness meditation condition, participants were instructed to listen to a 10-min guided audio meditation, following the spoken instructions as best they could with their eyes either open or closed. Participants were instructed to press the spacebar each time they became distracted by their thoughts. The meditation track (found at https://www.youtube.com/watch?v=tw7XBKhZJh4 with additional silence added between instructions to achieve the 10-min duration) was selected because it provided instructions specific enough to be accessible to participants without prior meditation experience, and included both attention-monitoring and acceptance training components, which have in combination been shown to reduce mind wandering more than attention monitoring alone58. The guided meditation instructed participants to nonjudgmentally notice the arising and passing of the breath and bodily sensations, sounds, moods, and thoughts, and to gently return their attention to one of these sense objects each time they noticed they were lost in thought. A meditation bell played at a comfortable volume indicated the end of the audio track.

Thought content questionnaires

Participants who were assigned to one of the 10-min incubation tasks completed questionnaires retrospectively assessing thought content during the incubation period. Participants completed the Thinking Content component of the Dundee Stress State Questionnaire (DSSQ)43, which contains items measuring task-related (e.g., “I thought about how much time I had left.”) and task-irrelevant (e.g., “I thought about something that made me feel guilty.”) cognitive interference. One item (“I thought about the difficulty of the problems”) was modified (“I thought about the difficulty of the task”) so it could apply to the mindfulness task in addition to the n-back tasks. Items were presented in random order to minimize order effects. In addition to the DSSQ, participants were asked to assess the degree to which they consciously thought about their written story during the break on a sliding scale (0 = none at all to 100 = the entire time).

Procedure and design

The study consisted of one approximately 40-min session, conducted in participants’ web browsers. Following consent procedures, participants were asked to quit and turn off applications and devices that might distract them and then completed the demographics and trait questionnaires.

Participants were randomly assigned to conditions by the Qualtrics randomizer. Three variables were manipulated between subjects: (a) incubation task (i.e., activity between the first and second writing session: 0-back task, 2-back task, mindfulness meditation, or no break), (b) incubation goal (i.e., whether participants were informed before the break that they would write again based on their prompt: goal or no goal), and (c) prior exposure (i.e., whether the post-incubation writing prompt matched the pre-incubation prompt: repeated prompt or new prompt). Since the presence of an incubation goal is only applicable to incubation periods, this variable was not manipulated for participants in the “no break” condition. To avoid deception, participants in the “goal” condition were always assigned to receive a repeated prompt after the break. Group assignment used a block randomization procedure, assigning each participant to one of the 11 possible variable combinations of equal size. This study design is depicted in Fig. 1.
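
The count of 11 cells follows from the two constraints above (no goal manipulation without a break; the goal condition always paired with a repeated prompt). The Python sketch below enumerates the cells and shows one generic way to block-randomize across them. It is an illustration only: assignment in the study was handled by the Qualtrics randomizer, and the names design_cells and block_randomize are hypothetical.

```python
import random
from itertools import product

BREAK_TASKS = ["0-back", "2-back", "mindfulness"]

# Allowed (goal, prompt) pairings for participants who receive a break.
# "goal" is never paired with a new prompt, to avoid deception.
GOAL_PROMPT_PAIRS = [
    ("goal", "repeated"),
    ("no goal", "repeated"),
    ("no goal", "new"),
]


def design_cells():
    """Enumerate the between-subjects cells described in the text."""
    cells = [
        (task, goal, prompt)
        for task, (goal, prompt) in product(BREAK_TASKS, GOAL_PROMPT_PAIRS)
    ]
    # Incubation goal is not applicable when there is no break.
    cells += [("no break", None, "repeated"), ("no break", None, "new")]
    return cells


def block_randomize(n_participants, seed=0):
    """Assign participants in shuffled blocks so cell sizes stay balanced."""
    rng = random.Random(seed)
    cells = design_cells()
    assignments = []
    while len(assignments) < n_participants:
        block = cells[:]
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_participants]


print(len(design_cells()))  # 11
```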

Fig. 1. Graphical overview of the study design. Darker boxes represent each stage of the study procedure (ordered from left to right), while lighter boxes indicate the between-subjects conditions for each stage.

First, participants engaged in the pre-incubation writing task for 10 min. Following this session, participants assigned to one of the three 10-min incubation tasks were presented with a general instructions screen for the upcoming incubation period. On this screen, participants assigned to the “goal” condition were told they would continue writing based on their prompt after a break (“Time for a break! You will continue writing based on this prompt after the break. During this break, you will engage in an attention task.”) while participants in the “no goal” condition were not told in advance (“Thank you for participating in this creative writing part of the study. You will now engage in an attention task.”). Participants were required to click a checkbox that reiterated this message and indicated that they would try to complete the given task to the best of their ability. Participants were then presented with instructions for their assigned incubation task (0-back, 2-back, or mindfulness) and subsequently engaged in the task for 10 min. Immediately afterwards, they completed the thought content questionnaires and also indicated how often during the break they engaged in something unrelated to the study such as checking their phone (never, once, several times, a lot, or the entire task). They were told that their response would not affect their study compensation. Participants assigned to the “no break” condition were presented with the instructions screen for the second writing session immediately after writing their first story.

Next, all participants engaged in the post-incubation writing task for 10 min. Some participants received the same prompt as their pre-incubation writing session, while others received a new prompt. Finally, participants were asked to rate how creative they thought each of their stories was on a sliding scale (0 = not creative at all to 100 = very creative). Participants in the meditation condition were also asked whether they recognized the voice from the guided meditation and, if so, to indicate the name of the speaker.

Analysis measures

Incubation thought content measures

Mind wandering during the incubation period was assessed by the task-irrelevant cognitive interference index from the DSSQ43, which has been used to assess mind wandering in previous studies23,59. We also calculated the task-related cognitive interference index from the DSSQ and the degree of story-related thought from the sliding scale.

Creative improvement measures

As our primary outcome measure, we calculated the within-subject percent change in creativity from participants’ pre-incubation story to their post-incubation story: creative improvement = (Creativity_post − Creativity_pre) / Creativity_pre × 100. Story creativity was assessed using both human ratings and computational methods.
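
As a worked illustration of the percent-change measure, the short Python sketch below applies it to made-up scores. The function name creative_improvement and the zero-baseline guard are assumptions added for the example.

```python
def creative_improvement(pre, post):
    """Percent change in creativity from the pre- to the post-incubation story."""
    if pre == 0:
        raise ValueError("percent change is undefined when the baseline is zero")
    return (post - pre) / pre * 100


# Made-up scores: a mean rating of 3.0 before the break and 3.75 after it.
print(creative_improvement(pre=3.0, post=3.75))  # 25.0
```

Because the change is scaled by the baseline, the same absolute gain yields a larger percentage for a story with a lower pre-incubation score.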

Human ratings

We obtained human ratings of story creativity following the consensual assessment technique60, which has shown high levels of interrater reliability for both expert and quasi-expert ratings of writing samples61,62,63. Participants’ stories were each evaluated by three graduate students in the University of Southern California’s Creative Writing and Literature PhD program, who worked independently and relied on tacit knowledge from their experience in the field, without explicit criteria or training by the researchers. Each story was presented alongside the writing prompt, and judges rated (a) story creativity and (b) writing quality (i.e., technical execution) on a six-level Likert scale (scales with five or fewer levels have shown reduced precision64) or alternatively flagged the story if it was incomprehensible or irrelevant to the prompt. Judges were instructed to rate the stories relative to the others; thus, for ease of comparison across stories, judges saw all stories from one writing prompt followed by all stories from the other prompt. Stories were presented in different random orders within each prompt and prompt order was randomized across judges. Judges were blind to experimental condition and to which stories were written before or after the incubation period. Interrater reliability for creativity ratings was calculated using Cronbach’s alpha after applying relevant exclusion criteria, and was found to be α = 0.72. For each story, creativity was quantified as the average rating across judges.
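
For readers unfamiliar with the reliability statistic, the Python sketch below computes Cronbach's alpha with judges treated as items, along with the per-story average rating. It uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of summed scores). The ratings are made up, and the function names are hypothetical rather than the study's analysis code.

```python
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha with judges as items.

    ratings: list of rows, one per story, each row holding one score per judge.
    """
    k = len(ratings[0])
    judge_columns = list(zip(*ratings))
    item_variance = sum(variance(col) for col in judge_columns)
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - item_variance / total_variance)


def story_scores(ratings):
    """Creativity of each story as the average rating across judges."""
    return [mean(row) for row in ratings]


# Made-up ratings for four stories by three judges on the 1-6 scale.
toy_ratings = [[2, 3, 2], [4, 4, 5], [5, 6, 5], [3, 2, 3]]
print(round(cronbach_alpha(toy_ratings), 2), story_scores(toy_ratings))
```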

Forward flow

While the consensual assessment technique has high face validity and construct validity, there is a lack of systematic study of the optimal implementation of this method65. Further, given that judges use their own tacit rating criteria for creativity, results may not be reproducible. Considering these issues, the present study also used an automated method to assess creativity based on semantic distance. The relationship between semantic distance and creativity is intuitive: the farther one “moves away” from more related or conventional ideas in semantic space, the more likely that idea is to be creative66. Studies have shown that automated semantic distance-based scoring of divergent thinking tasks strongly and reliably predicts human scoring67,68,69,70.

Story creativity was assessed using a measure proposed by Gray et al.71 termed forward flow, which quantifies the average degree to which each new idea semantically diverges from previous ideas. Originally developed for a stream-of-consciousness writing task, forward flow has been shown to predict individual differences in both divergent thinking performance and real-world creative behavior71,72. Applying forward flow to quantify narrative creativity may offer advantages over other semantic distance-based approaches – such as divergent semantic integration (DSI), the average semantic distance between all word pairs in a narrative73 – because forward flow accounts for the order in which ideas are introduced, allowing for sensitivity to how a story unfolds over time. To ensure the robustness of our findings, we also replicated our key analyses using DSI, reported in Supplementary Table S2.

To preprocess participants’ stories, we converted text to lowercase, then used the tm package74 (version 0.7.8) in R to remove numbers, punctuation, and stop words. Removing stop words has been shown to remove bias in automated divergent thinking scoring75 and could remove noise from the semantic analysis since it reduces stories to their main ideas. The tutorial code from Johnson et al.73, which calculates average semantic distance between all word pairs, was modified to instead calculate forward flow as follows. Semantic similarity between all word pairs was calculated by taking the cosine of the angle between word vectors, generated using the provided corpus from Touchstone Applied Science Associates (TASA), the same corpus used in the studies in Gray et al.71. Taking the inverse of these values yielded semantic distance values for all word pairs. Following Gray et al.71, forward flow was calculated by computing the average semantic distance between each word and all preceding words in the story (instantaneous forward flow) and then averaging these values across the story. The resulting forward flow score represents the average degree to which each new idea semantically diverges from previous ideas in the story. Figure 2 shows the formula for forward flow and a schematic of how it was applied to an example story.
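
The two-step averaging can be sketched in a few lines of Python. This is a toy illustration, not the modified R tutorial code used in the study. It makes three assumptions: semantic distance is taken as 1 minus cosine similarity, the first word is skipped because it has no preceding words, and the two-dimensional vectors are invented stand-ins for the TASA-derived word vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def forward_flow(words, vectors):
    """Mean, over words 2..n, of each word's average distance to all earlier words."""
    if len(words) < 2:
        raise ValueError("forward flow needs at least two words")
    instantaneous = []
    for i in range(1, len(words)):
        # Assumption: distance = 1 - cosine similarity.
        distances = [
            1 - cosine_similarity(vectors[words[i]], vectors[words[j]])
            for j in range(i)
        ]
        instantaneous.append(sum(distances) / len(distances))
    return sum(instantaneous) / len(instantaneous)


# Toy 2-D vectors (illustrative only): "king" and "queen" coincide, "storm" is orthogonal.
toy_vectors = {"king": (1.0, 0.0), "queen": (1.0, 0.0), "storm": (0.0, 1.0)}
print(forward_flow(["king", "queen", "storm"], toy_vectors))  # 0.5
```

In the toy story, "queen" adds no distance from "king" (instantaneous value 0), while "storm" is maximally distant from both earlier words (instantaneous value 1), so the story-level score is their mean.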

Fig. 2. Schematic of forward flow calculation. (a) Example story fragment used for forward flow analysis. Each arrow represents the semantic distance D_{i,j} between word i and a preceding word j. Gray words are stop words removed during preprocessing. (b) Formula for computing forward flow71. For each word i, its instantaneous forward flow is calculated as the average semantic distance to all previous words (arrow values within the same color). The overall forward flow of the story is then determined by averaging these instantaneous values.

Given that including the writing prompt when calculating a story’s average forward flow would bias this value towards the average forward flow of the writing prompt itself, we removed the writing prompt when calculating forward flow. Since a high forward flow score may be achieved by writing random words, one of our exclusion criteria was story incomprehensibility (see Exclusion Criteria section for details). To verify the validity of forward flow for measuring creativity, we tested the correlation between human-rated creativity and forward flow after applying relevant exclusion criteria, and found it to be significant, r(382) = 0.51, p < 0.001.

GPT-4 ratings

As an exploratory method, we utilized GPT-476 to score story creativity. GPT-4 has shown strong performance in creativity scoring in prior research77. We provided the GPT-4 API with the same instructions that were provided to the human raters:

Below is a story that was written in response to the prompt. Rate the (1) story creativity and (2) writing quality on a scale from 1 to 6 (whole numbers). If the story is incomprehensible or irrelevant to the prompt, rate it 0.

These instructions were followed by the story prompt and written story (no preprocessing). We set the model temperature to 0 to maximize reproducibility. Creativity scores from GPT-4 correlated significantly with human ratings (r(382) = 0.42, p < 0.001) and with forward flow (r(382) = 0.42, p < 0.001).

Statistical analysis

Exclusion criteria

Participants were excluded if the majority of human raters flagged one of their written stories as incomprehensible or irrelevant to the prompt (seven participants). One participant was excluded due to technical difficulties with the experiment website resulting in exposure to both prompts before the break. For analyses comparing incubation tasks, we excluded participants based on insufficient task performance as indicated by failure to reach response and accuracy (excluding nonresponses) thresholds of 60% during the 0-back (one participant) or 2-back (14 participants, including one whose story was flagged as incomprehensible/irrelevant) tasks, not pressing the spacebar at least once during the meditation task (six participants), or reporting that they engaged in something unrelated to the study more than several times during the incubation task (one participant); one additional participant was excluded due to a technical problem with the experiment website resulting in loss of meditation task data.
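
One reading of the n-back performance criterion is sketched below in Python: a participant passes only if both the response rate on target trials and the accuracy among answered trials reach 60%. The data layout (True, False, or None per target trial), the inclusive threshold, and the function name passes_nback_check are assumptions made for illustration.

```python
def passes_nback_check(responses, threshold=0.60):
    """Apply the response-rate and accuracy thresholds to one participant.

    responses: one entry per target trial, True (correct), False (incorrect),
    or None (no response within the stimulus window).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        return False
    response_rate = len(answered) / len(responses)
    accuracy = sum(answered) / len(answered)  # nonresponses are excluded here
    return response_rate >= threshold and accuracy >= threshold


# Made-up participant: 20 target trials, 15 correct, 2 incorrect, 3 missed.
example = [True] * 15 + [False] * 2 + [None] * 3
print(passes_nback_check(example))  # True
```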

Effects on incubation thought content

To test whether the incubation tasks and incubation goal manipulated mind wandering in the intended directions, we used a two-way analysis of variance (ANOVA, type III sums of squares) with incubation task (0-back, 2-back, mindfulness) and incubation goal (goal, no goal) as the between-subject factors, and mind wandering as the dependent variable. To test how these manipulations affected other types of thought content, this analysis was repeated with task-related cognitive interference and conscious story-related thought as dependent variables. Statistical analyses were conducted in R using the stats and car packages, and post hoc pairwise comparisons were conducted using the emmeans package with Tukey-adjusted p-values.

Effects on creative improvement

To test how the incubation tasks affected creative improvement (H1), we used a two-way ANOVA with incubation task (0-back, 2-back, mindfulness, no break) and prior exposure (repeated prompt, new prompt) as the between-subject factors, and creative improvement as the dependent variable. Since participants in the “goal” condition always received a repeated prompt after the break, this analysis omitted participants in the “goal” condition such that prior exposure was not confounded with incubation goal.

To test how the presence of an implicit incubation goal affected creative improvement (H2) and how it interacted with incubation task, we used a two-way ANOVA with incubation goal (goal, no goal) and incubation task (0-back, 2-back, mindfulness) as the between-subject factors, and creative improvement as the dependent variable. Since participants in the “goal” condition always received a repeated prompt after the break, this analysis omitted participants in the “new prompt” condition such that incubation goal was not confounded with prior exposure. Note that since incubation goal was not manipulated in the “no break” condition, this analysis omitted participants in this condition. To rule out baseline differences in creativity, which could influence the potential for creativity improvement, we confirmed that pre-incubation creativity scores did not differ significantly across conditions (see Supplementary Table S3). Post hoc sensitivity analyses for these hypothesis tests were conducted using G*Power 3.1 (ref. 78) and are reported in Supplementary Table S4.
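For readers checking the reported F statistics against the designs described above, the degrees of freedom of a two-way between-subjects ANOVA follow directly from the number of levels of each factor and the number of participants analyzed. The sketch below assumes a full-factorial model that includes the interaction term and has no empty cells; the function name is hypothetical and the sample size in the usage note is illustrative, not a figure reported in this article.

```python
def two_way_anova_df(levels_a: int, levels_b: int, n_total: int) -> dict:
    """Degrees of freedom for a full-factorial two-way between-subjects ANOVA.

    Assumes every cell of the design contains at least one participant and
    that the model includes both main effects and their interaction.
    """
    cells = levels_a * levels_b
    if n_total <= cells:
        raise ValueError("n_total must exceed the number of design cells")
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": n_total - cells,
    }
```

For example, `two_way_anova_df(3, 2, 135)` gives 2, 1, and 2 numerator degrees of freedom for the task, goal, and interaction terms, and 129 error degrees of freedom.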

Relationship between mind wandering and creative improvement

To test whether mind wandering during the incubation period predicted creative improvement in participants who received a repeated prompt (H3), we used least-squares linear regression with mind wandering as the predictor and creative improvement as the dependent variable, conducted with participants in the “repeated prompt” condition. To assess the specificity of this effect, we followed up significant findings by testing whether the relationship held in the “new prompt” condition, whether other types of incubation thought content predicted improvement, whether the effect remained significant when controlling for dispositional mind wandering, and whether mind wandering was unrelated to baseline (i.e., pre-incubation) creativity.
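The single-predictor regression described above can be sketched in plain Python as follows. This is a minimal illustration of ordinary least squares, not the code used in the study (which was run in R); the function and field names are hypothetical, and whether the reported coefficients are standardized is not assumed here.

```python
import math
from typing import NamedTuple, Sequence


class OLSFit(NamedTuple):
    slope: float      # estimated coefficient for the predictor
    intercept: float
    se_slope: float   # standard error of the slope
    t_value: float    # slope divided by its standard error
    df_resid: int     # residual degrees of freedom (n - 2)


def simple_ols(x: Sequence[float], y: Sequence[float]) -> OLSFit:
    """Least-squares fit of y = intercept + slope * x with one predictor."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the usual variance estimate with n - 2 df.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_value = slope / se_slope if se_slope > 0 else math.inf
    return OLSFit(slope, intercept, se_slope, t_value, n - 2)
```

Here `x` would hold each participant's incubation mind wandering score and `y` the corresponding creative improvement score. The two-sided p-value is obtained by comparing `t_value` against a t distribution with `df_resid` degrees of freedom.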

Results

Effects on incubation thought content

Figure 3 shows experimental effects on mind wandering, task-related cognitive interference, and story-related thought during the incubation period. We found a significant main effect of incubation task on mind wandering, F(2, 129) = 10.8, p < 0.001. Post hoc comparisons revealed significantly more mind wandering during the mindfulness meditation than during the 0-back (t(129) = 3.01, p = 0.009) and 2-back (t(129) = 4.56, p < 0.001) tasks, while there was no significant difference between the 0-back and 2-back tasks, t(129) = 1.79, p = 0.177. There was no significant effect of incubation goal (F(1, 129) = 0.50, p = 0.482) and no interaction between incubation task and incubation goal (F(2, 129) = 0.94, p = 0.394) on mind wandering. We found a significant main effect of incubation task on task-related cognitive interference, F(2, 129) = 4.04, p = 0.020. Post hoc comparisons revealed significantly more task-related cognitive interference in the 2-back task than in the mindfulness condition (t(129) = 2.80, p = 0.016), but no significant differences between the 0-back task and either the 2-back task (t(129) = 1.16, p = 0.479) or the mindfulness condition (t(129) = 1.79, p = 0.178). There was no significant effect of incubation goal (F(1, 129) = 1.36, p = 0.246) and no interaction between incubation task and incubation goal (F(2, 129) = 0.14, p = 0.868) on task-related cognitive interference. There were no significant effects of incubation task (F(2, 129) = 0.15, p = 0.857), incubation goal (F(1, 129) = 0.04, p = 0.848), or their interaction (F(2, 129) = 2.99, p = 0.054) on story-related thought.

Fig. 3

Effects of incubation task and incubation goal on three types of thought content: (a) mind wandering, (b) task-related cognitive interference, and (c) story-related thought. Error bars represent the standard error of the mean. *p < 0.05. **p < 0.01. ***p < 0.001. p-values are adjusted for multiple comparisons using the Tukey method. In panels (a) and (b), possible values ranged from 1 to 5; in panel (c), possible values ranged from 0 to 100.

Effects on creative improvement

Using human-scored creativity, we found no significant effects of incubation task (F(3, 117) = 0.17, p = 0.916), prior exposure (F(1, 117) = 0.71, p = 0.403), or their interaction (F(3, 117) = 0.13, p = 0.943) on creative improvement. Using the forward flow measure, we found a significant main effect of prior exposure (F(1, 117) = 4.47, p = 0.037), where participants receiving a repeated prompt showed more creative improvement than participants receiving a new prompt. There was no effect of incubation task (F(3, 117) = 1.00, p = 0.396) and no interaction between incubation task and prior exposure (F(3, 117) = 0.15, p = 0.930).

Using human-scored creativity, we found no significant effects of incubation goal (F(1, 86) = 0.00, p = 0.988), incubation task (F(2, 86) = 0.19, p = 0.829), or their interaction (F(2, 86) = 0.09, p = 0.919) on creative improvement. Likewise, there were no significant effects of incubation goal (F(1, 86) = 1.64, p = 0.204), incubation task (F(2, 86) = 0.04, p = 0.964), or their interaction (F(2, 86) = 0.86, p = 0.426) on creative improvement using forward flow.

Relationship between mind wandering and creative improvement

Using human-scored creativity, mind wandering during incubation did not significantly predict creative improvement in the “repeated prompt” condition (β = −1.39, SE = 5.29, p = 0.793). However, using forward flow, higher levels of incubation mind wandering significantly predicted greater creative improvement in the “repeated prompt” condition (β = 0.33, SE = 0.14, p = 0.024). To test whether mind wandering benefits were specific to story incubation (i.e., prior exposure to the prompt), we repeated this analysis in the “new prompt” control condition. We found that mind wandering did not predict creative improvement in the “new prompt” condition (β = −0.18, SE = 0.26, p = 0.495). Figure 4 shows the relationship between mind wandering and creative improvement in the “repeated prompt” and “new prompt” conditions using the forward flow measure.

Fig. 4

Relationship between incubation mind wandering and creative improvement, as measured by change in forward flow. (a) Participants who received the same prompt after incubation (i.e., prior exposure) showed a significant positive association between mind wandering and forward flow change (p = 0.024). (b) Participants who received a new prompt after incubation (i.e., no prior exposure) showed no significant association (p = 0.495).

To test whether this relationship was specific to mind wandering thought content, we tested whether task-related cognitive interference and conscious story-related thought predicted creative improvement as measured by forward flow. We found that creative improvement was not significantly predicted by explicit thought about the story (β = 0.00, SE = 0.00, p = 0.408) or task-related cognitive interference (β = 0.19, SE = 0.13, p = 0.136) in the “repeated prompt” condition. To verify that this relationship was not merely explained by individual differences in the tendency to mind wander, we repeated the original analysis with dispositional mind wandering included as a covariate, and found that the relationship between incubation mind wandering and creative improvement remained significant (β = 0.38, SE = 0.15, p = 0.012). Finally, we confirmed that mind wandering was not significantly related to pre-incubation forward flow scores (β = −0.00, SE = 0.00, p = 0.246). All of these analyses were replicated using DSI, a previously validated semantic distance-based measure of narrative creativity73, and yielded similar results (see Supplementary Table S2).

In an exploratory analysis, we tested this relationship using creativity ratings scored by GPT-4 instead of forward flow. Consistent with previous results, higher levels of incubation mind wandering significantly predicted greater creative improvement in the “repeated prompt” condition (β = 6.76, SE = 3.18, p = 0.036) but not the “new prompt” condition (β = −5.75, SE = 7.25, p = 0.432). Explicit thought about the story (β = 0.08, SE = 0.10, p = 0.392) and task-related cognitive interference (β = −2.77, SE = 2.86, p = 0.335) did not significantly predict creative improvement from GPT-4 ratings in the “repeated prompt” condition. The relationship between mind wandering and creative improvement from GPT-4 ratings also remained significant after controlling for dispositional mind wandering (β = 7.11, SE = 3.32, p = 0.035). We confirmed there was no significant relationship between mind wandering and pre-incubation GPT-4 scores (β = −0.18, SE = 0.14, p = 0.198).

Discussion

In this study, we aimed to manipulate mind wandering through various tasks during an incubation period in order to examine whether these tasks would facilitate corresponding changes in creativity. While we observed task-dependent shifts in mind wandering, albeit in unexpected directions, we did not observe task-dependent changes in creativity. We then tested the association between mind wandering during incubation and within-subject creative improvement using several methods of creativity assessment. While no significant results emerged using human-scored creativity ratings, we found a significant positive relationship between incubation mind wandering and creative improvement using a semantic distance-based measure and GPT-4. The relationship between mind wandering and creative improvement was specific to the incubation condition (repeated prompt) rather than a control condition (new prompt), was specific to mind wandering rather than two other types of thought content, including explicit thought about the story, and remained significant after controlling for dispositional mind wandering.

Our hypothesis (H1) that participants would show greater creative improvement following breaks designed to promote mind wandering, particularly after prior exposure to the writing prompt, was not supported. First, our manipulation did not affect mind wandering in the intended directions. Based on prior research23,24,27,28,29,30,31, we expected that both a cognitively demanding (2-back) task and mindfulness meditation would dampen mind wandering in comparison to a cognitively undemanding (0-back) task. Contrary to our expectations, mindfulness meditation resulted in the highest levels of reported mind wandering. This finding may be attributed to participants’ limited prior meditation experience, as research suggests that the effects of mindfulness on mind wandering differ between novice and expert meditators79,80. Mindfulness may have also heightened participants’ awareness of mind wandering, leading to higher self-reported levels. No significant differences in mind wandering were observed between the 0-back and 2-back tasks, despite prior research showing that very similar working memory tasks influence mind wandering23,24,29. Given that mind wandering levels across these two tasks followed the expected trends, it is possible that the present study lacked the statistical power to detect meaningful differences. Nevertheless, contrary to prior findings1, we observed no differences in creative improvement between the incubation tasks, nor a general benefit of incubation compared to the “no break” condition. However, when using forward flow, we found a significant increase in creative improvement in the “repeated prompt” condition compared to the “new prompt” condition, suggesting that writing based on the same prompt facilitates the generation of stories that progress in a more divergent manner. This effect, however, does not appear to be related to incubation and may instead reflect participants expanding upon their initial ideas, enabling quicker divergence during the second writing session.

Our hypothesis (H2) positing that an implicit writing goal during incubation would enhance mind wandering and creativity was not supported, suggesting that goal-oriented top-down processes during incubation do not have a significant influence on creativity. These results contrast with those of Medd and Houtz9, who reported that the presence of a writing goal facilitated creative incubation. However, their study involved explicit instruction to think about the pre-incubation story during the break, whereas our design employed an implicit goal to better investigate mind wandering. It is possible that the implicit goal in our study was too subtle to produce measurable effects, as it was conveyed via a single instruction screen, which may have been overlooked by some participants. Additionally, prior work has shown that tasks engaging different components of vigilance (i.e., arousal vs. executive function) elicit different proportions of intentional versus unintentional mind wandering81. Thus, differences in how our incubation tasks (e.g., mindfulness vs. 2-back) engaged these components could have masked differences in creative improvement – particularly if creativity is more sensitive to one type of mind wandering than the other22,48. Finally, a more general limitation of our study is the relatively small sample size, which, given the number of experimental conditions, may have limited our ability to detect significant effects. Our preregistered power analysis was designed to detect a general incubation effect (incubation vs. no break) using the mean effect size reported by Sio and Ormerod1. Although we increased our planned sample size to better accommodate the multiple factors and interactions included in the final study design, post hoc sensitivity analyses indicated that the final sample sizes afforded sufficient power only to detect moderate or larger effects for hypotheses H1 and H2 (see Supplementary Table S4).

Our third hypothesis (H3) proposed that mind wandering during incubation would correlate with creative improvement in participants with prior exposure to their post-incubation writing prompt. While this effect was not observed using human-scored creativity, we did find the hypothesized relationship using creativity scores from forward flow and GPT-4. Notably, this relationship was specific to participants who received a repeated writing prompt, suggesting that the effect was not due to mind wandering fostering a general mental state conducive to creativity. Instead, it seems to reflect the role of creative incubation following prior exposure to the writing prompt. Further, explicit thought about the story during incubation was not associated with creative improvement, providing support for the unconscious-work hypothesis – incubation effects may arise from gradual, undirected processes occurring during the incubation of the problem representation rather than from continued conscious problem solving during the break. Although these findings do not establish causality, the observed relationship between incubation mind wandering and creative improvement remained significant after controlling for dispositional mind wandering, suggesting that it is not merely attributable to highly creative individuals tending to mind wander more frequently – a pattern previously identified in the literature82 – and may instead be specific to mind wandering induced during the study’s incubation period. However, it is important to note that this relationship was relatively weak, indicating that other cognitive and environmental factors also contribute meaningfully to fluctuations in creativity over time.

The relationship between mind wandering and creative improvement appears to differ across human-scored and computational measures, though the underlying reasons for this disparity remain unclear. While significant correlations were observed between the various creativity measures, each may capture distinct dimensions of the creative process. Forward flow, for instance, primarily quantifies the semantic divergence in idea progression, capturing the dynamic generation of novel concepts. In contrast, human raters may adopt a more holistic approach, integrating multiple facets of creativity in their assessments. Mind wandering has been associated with spreading activation through semantic space8, which may align more closely with forward flow’s measurement of semantic divergence in idea progression. It is possible that additional synthesis and integration of these ideas is required for humans to consider the product as creative. However, this interpretation does not fully explain differences in our results using human-scored creativity and GPT-4 scores. Prior research indicates that GPT models exhibit greater alignment with human creativity ratings than semantic distance measures77, raising questions about the reasons for the observed differences across our analyses. Since the consensual assessment technique avoids explicit evaluation criteria to preserve ecological validity, the absence of standardized criteria makes it difficult to interpret inconsistencies across analytical methods. It is possible that including additional expert raters could enhance the reliability of human-scored creativity and possibly provide greater convergence with GPT-4 scores.

To our knowledge, the present study is the first to report a significant relationship between mind wandering during an incubation period and within-subject improvement in creativity. Prior investigations have either not tested this relationship directly23, not considered within-subject changes in creativity20, or found the relationship to be nonsignificant24,25,26. Our results provide stronger evidence for a small but beneficial role of mind wandering during the incubation of a task that resembles real-world creative behavior. Increasing the sample size, incorporating additional human raters, and expanding to different domains of creativity could enhance statistical power, reliability, and generalizability. Future research may explore how the specific content of mind wandering interacts with problem representations during incubation to support creativity, as well as whether these effects are domain-specific or indicative of a broader, domain-general mechanism. As our study did not explicitly distinguish between intentional and unintentional mind wandering in the retrospective questionnaires, future studies could also clarify whether states characterized by intentional versus unintentional mind wandering offer greater benefits to creativity – especially given that prior work on the topic has largely employed trait-level measures22,48. Finally, neuroimaging paradigms could help elucidate the neural mechanisms underlying incubation benefits, particularly the role of the default mode network given its involvement in both mind wandering83,84,85,86 and creative thought87. Understanding these neural mechanisms could provide deeper insight into how mind wandering contributes to creative thought and inform strategies to enhance creative problem solving.