Interruption, recall and resumption: a meta-analysis of the Zeigarnik and Ovsiankina effects

It is a common belief that an interrupted intention is better remembered than a completed one; less common is the belief that an interrupted intention reliably urges us toward its completion. These beliefs resulted from two famous studies inspired by Kurt Lewin at the beginning of the twentieth century. First, Zeigarnik (1927) investigated the influence of interrupted intentions on subsequent memory. She presented participants with numerous tasks, of which some were interrupted, and reported that afterward, participants recalled more interrupted than finished tasks (the so-called Zeigarnik effect). Second, using a similar approach, Ovsiankina (1928) reported that participants tended to resume the interrupted tasks unprovoked when given the opportunity (the so-called Ovsiankina effect). Lewin (1926) proposed that forming an intention builds up a tension by attaching valence (Aufforderungscharakter) to specific objects and situations. This inner tension represents a quasi-need and persists until the successful fulfillment of the intention. If an intention is interrupted, this tension cannot be discharged, which results in better recall of unfinished activities compared to finished ones and also drives the resumption of unfinished activities. Although this theory is intuitively plausible, the empirical evidence, in particular for the Zeigarnik effect, turned out to be difficult to replicate (see MacLeod, 2020 for an overview; see Van Bergen, 1968 for an in-depth replication of the original study). Ironically, the lesser-known Ovsiankina effect has shown more consistent empirical support. The purpose of this study is to present a meta-analysis of the existing evidence for both the Zeigarnik effect and the Ovsiankina effect.

We focus on studies that used the same methodological approach as Zeigarnik and Ovsiankina, an approach subsequently labeled “interrupted task paradigm” (e.g., Atkinson, 1953; Cooper, 1983; Green, 1963; Mahler, 1933; Moot et al., 1988; Pachauri, 1935; Rösler, 1955; Van Bergen, 1968; Weiner, 1966). In this paradigm, several tasks are presented in sequence, half of which are interrupted, whereas the other half are completed. Then, the recall of the tasks or their resumption rate is assessed.

Zeigarnik effect

For the Zeigarnik effect, over the ensuing years, it became evident that it is not a universally reliable phenomenon but appears to be bound to specific circumstances. One line of research focused on individual differences, another approach focused on situational effects. We included these influencing factors in the meta-analysis and we next provide a brief overview of the most relevant studies.

Individual differences

Atkinson (1953) investigated the Zeigarnik effect with respect to achievement motivation. A high achievement motive is seen as a stable disposition to strive for achievement or success (Atkinson, 1957). Participants were divided into two groups (high vs. low achievement motivation). The experimental setting involved a task-oriented condition, in which no attempt at creating an experimental atmosphere was made, a relaxed-orientation condition, in which a relaxed atmosphere was created, and an achievement-orientation condition, in which a competitive atmosphere was created. High achievement-motivated participants recalled remarkably more interrupted tasks than completed tasks in the achievement-oriented condition, whereas low achievement-motivated participants exhibited the opposite pattern. These results underlined the importance of individual differences and situational influences in relation to the Zeigarnik effect. Consequently, achievement motivation gained popularity in this research field (Cooper, 1983; Moot et al., 1988; Raffini and Rosemier, 1972; Reiss, 1968; Weiner, 1966).

Following these results, a line of research began investigating the influence of other potential individual differences on the Zeigarnik effect. Some authors believed task-involvement to be a critical factor in the occurrence of the effect. Green (1963), for instance, noted that participants in replication studies differed in their requirement to participate: While some studies included volunteers, others tested students who were required to participate as part of their curriculum. He concluded that volunteers should be more task-involved, aiming to complete the task successfully. Accordingly, volunteers exhibited a stronger Zeigarnik effect than non-volunteers. Similarly, a heightened task-involvement was present in dominant individuals due to their task-oriented attitudes and strong completion-tendencies (Gough et al., 1951; Sinha and Sharan, 1976). When testing both dominant and submissive individuals, dominant individuals exhibited a Zeigarnik effect. Submissive individuals, however, exhibited the opposite pattern, an inversion of the Zeigarnik effect.

Other authors noted that an inversion of the Zeigarnik effect usually occurred in anxiety-inducing situations when the ego was involved (Farley and Mealiea, 1971; Glixman, 1949; Rosenzweig, 1943). It was argued that recall of failure of unfinished tasks would threaten the ego and should, therefore, be repressed (Weiner et al., 1968). Hence, it was hypothesized that individuals with a tendency to avoid threatening stimuli (e.g., repressors) should exhibit this inversion of the Zeigarnik effect, whereas individuals with a tendency to approach threatening stimuli (e.g., sensitizers) should not. Of the studies contrasting repressors and sensitizers, however, only one successfully demonstrated differing recall patterns: While Farley and Mealiea (1973) found an inverse Zeigarnik effect in both repressors and sensitizers, Hofstaetter (1985) successfully demonstrated a Zeigarnik effect in sensitizers but an inverse Zeigarnik effect in repressors.

In an attempt to resolve this inconsistency, Claeys (1969) proposed two different mechanisms, a drive urging toward the completion of unfinished tasks resulting in better memory of interrupted tasks (i.e., the Zeigarnik effect), and a tendency to recall completed tasks in favor of the ego, which he labeled “success factor”. He hypothesized that neurotic individuals would exhibit a stronger Zeigarnik effect compared to stable individuals due to their tense and overdriven nature. On the other hand, introverted individuals would spontaneously evaluate their performance on tasks as good or bad compared to extroverted individuals and would therefore be in need of this “success factor”, resulting in an inverse Zeigarnik effect. Individuals high in neuroticism and low in introversion both recalled more interrupted tasks than did individuals low in neuroticism and high in introversion. Since then, however, the relation of neuroticism and introversion with the Zeigarnik effect has not been investigated again.

Situational influence

Besides the influence of individual differences on the Zeigarnik effect, the situational influence has been identified as an important factor. In a series of experiments Marrow (1938a, 1938b) varied the situational influence by using different instructions. In one experiment, participants were given a neutral and sober description of the experimental procedure. In a second experiment, participants (i.e., American students) were informed that the purpose of the study was to replicate a previous experiment conducted with a German sample, for which a preliminary analysis had revealed superior performance by the American students. In a third experiment, participants were given the same instruction as in the second experiment but were told that a preliminary analysis had revealed superior performance by the German students. The minor memory advantage of interrupted tasks observed in the first experiment drastically increased in the second experiment when participants were encouraged by the instruction but decreased again in the third experiment when discouraged by a demoralizing instruction.

Following Marrow (1938a, 1938b), researchers manipulating the situational influence followed a threefold distinction of situational conditions: The first was a neutral and task-focused condition sometimes used as a baseline measure, called task orientation condition (Atkinson, 1953; Green, 1963; Hays, 1952) or neutral condition (Caron and Wallach, 1957). The second was a more formal and demanding condition designed to emphasize the importance of the tasks by framing them as a measure of intellect, labeled as achievement orientation condition (Alper, 1946; Atkinson, 1953; Hays, 1952), ego orientation condition (Green, 1963), formal condition (Claeys, 1969), or stressful condition (Caron and Wallach, 1957; Glixman, 1949). The third was a more informal condition to minimize the focus on the task, in which the subjects performed the tasks under the pretext of assisting the experimenter in testing the material, called either a relaxed orientation condition (Alper, 1946; Atkinson, 1953) or informal condition (Claeys, 1969). These conditions produced mixed results for the Zeigarnik effect, likely due to their interaction with individual differences, as Atkinson (1953) had previously demonstrated.

Inspired by Atkinson’s (1953) results, Weiner (1966) became interested in how the social context could determine achievement-related behavior. In particular, he was interested in how participants competing with a same-sex or opposite-sex competitor would vary in their achievement-related behavior, manifesting in different recall patterns. Women exhibited a greater Zeigarnik effect when competing against other women than when competing against men, although the difference did not reach significance. Conversely, men exhibited a greater Zeigarnik effect when competing against women and an inverse Zeigarnik effect when competing against other men. Weiner (1966) concluded that female competitors were better suited to enhancing achievement striving than male competitors, as both women and men exhibited a greater Zeigarnik effect when competing with a woman.

Another variable suspected to influence the Zeigarnik effect was the presence of interpolated and subsequent tasks. When interpolating simple and complex tasks, completed tasks were more frequently recalled when followed by a complex task. In contrast, interrupted tasks were more frequently recalled when followed by a simple task (Hays, 1952). When a new activity followed a set of multiple interrupted and completed tasks before recall, individuals recalled more completed tasks when presented with a demanding task, such as performing a new set of tasks. In contrast, they recalled more interrupted tasks when followed by a less demanding task, such as reading a book (Prentice, 1944). The Zeigarnik effect, therefore, manifested mainly in the presence of interpolated or subsequent cognitively undemanding tasks. However, it appeared to reverse through retroactive inhibition when cognitive resources were exhausted by complex interpolated and subsequent tasks.

The Ovsiankina effect

The Ovsiankina effect also received subsequent attention and appeared to be more reliable. Several authors demonstrated that participants reliably resumed an interrupted task when given the opportunity (Katz, 1938; Mahler, 1933; Nowlis, 1941; Rethlingshafer, 1941). This effect occurred in adults, children, and individuals with intellectual disabilities (Rethlingshafer, 1941; Rösler, 1955). Even when presented with interesting alternative tasks, participants reliably resumed their interrupted tasks when given the choice (Mahler, 1933). This tendency correlated positively with the attractiveness of the interrupted task; however, it decreased the more the alternative task resembled the interrupted task (Henle, 1942; Lissner, 1933). A recent computerized study further showed that when participants were interrupted in their activity by a prompt under the pretext of a network problem and instructed to wait for 60 s but given the opportunity to resume, they reliably dismissed the prompt and resumed their activity (Birk et al., 2020).

The Ovsiankina effect was also investigated in different clinical samples. Ovsiankina herself, for instance, demonstrated that the tendency to spontaneously resume interrupted tasks was lower in a sample of participants with schizophrenia (Rickers-Ovsiankina, 1937). She concluded that individuals with schizophrenia lacked the ability to form firmly segregated tension systems, similar to their inability to maintain a constant stream of thought and pursue goals. Similarly, Chorus (1942) demonstrated that a small sample of children with hyperactivity disorder (“psychomoteurs purs”) completely failed to resume the interrupted tasks. Their physical behavior, although very pronounced, lacked determination, and this carried over to other domains of their life, such as their attention, their thoughts, or their work: Their chaotic nature interfered with their ability to act in a goal-directed manner. Both studies thus emphasized that the prerequisite for the Ovsiankina effect is the ability to maintain goals and perform goal-directed behavior.

A study by Malerstein (1969) raised an intriguing question regarding the necessity of explicit memory for the resumption of tasks. In this study, participants were interrupted in their completion of a puzzle by a substitute task. Afterward, the experimenter left the room and resumption of the initial puzzle was recorded. Subjects included healthy controls, hospitalized alcoholics, and Korsakoff patients. The latter group yielded the most interesting findings, as Korsakoff’s syndrome is characterized by global amnesia (Arts et al., 2017): Although the resumption rate was lower than in healthy controls, resumption of the task occurred in the Korsakoff patients, even in those patients who did not recall the task. However, it is unclear to what extent the patients resumed the task out of boredom to bridge the 20 min waiting period, and the author mentions that their manner or incidental comments may have influenced the results.

The effect seemed to reflect a truly intrinsically motivated tendency. McGraw and Fiala (1982) were interested in how extrinsically motivating participants—through a monetary incentive for participation—would affect the resumption rate of interrupted tasks. In one condition, participants were told beforehand about the financial compensation for their participation; in another condition, participants were not informed about the compensation. Interestingly, the monetary incentive noticeably lowered the resumption rate of interrupted tasks. The authors concluded that by introducing an incentive for participation, the participants’ intrinsic goal for task-completion was replaced by an extrinsic goal of participating, which rendered the resumption of a task obsolete.

Reeve et al. (1986) wondered whether the Ovsiankina effect was the same as intrinsic motivation and, if not, how each could be delimited. Participants were tasked with solving a series of puzzles in a competitive setting and were either allowed to beat the competitor (competence feedback), were prearranged to lose against the competitor (incompetence feedback), or did not compete (no feedback). Participants were interrupted on the tasks by a time limit. The authors used two indices in their study: the resumption rate measuring the Ovsiankina effect, and the time spent on the task after the resumption as a measure of interest in the task and an intrinsic motivation index. The authors found that competence feedback increased both the Ovsiankina effect and the intrinsic motivation index, but the effect was greater on intrinsic motivation. Additionally, the study found that participants displayed intrinsically motivated behaviors even after completing tasks, distinguishing intrinsic motivation from the Ovsiankina effect.

Liberman et al. (1999) later reintroduced the idea of manipulating motivation in relation to the resumption or substitution of interrupted tasks. They assigned participants to two conditions: a promotion condition, in which tasks were framed as a gain-nongain situation, or a prevention condition, in which tasks were framed as a loss-nonloss situation. In Experiment 1, for instance, participants in the promotion condition were awarded points when completing a task but no points when failing to complete a task, whereas in the prevention condition, points were deducted when participants failed to complete a task but not when they completed it. Participants in the prevention condition consistently chose to resume an interrupted task, rather than substitute it, more frequently than participants in the promotion condition. Therefore, individuals with a prevention focus may be more inclined to maintain stability and avoid losses by resuming interrupted tasks.

Summary

Whereas the Ovsiankina effect has proven to be a rather reliable phenomenon, the evidence surrounding the Zeigarnik effect presents a more complex picture. In particular, the interplay between individual differences and situational influences complicates understanding the Zeigarnik effect. Individual differences such as achievement motivation, the voluntary or involuntary nature of participation, the tendency to repress ego-threatening stimuli, and personality features seem to interact in a complex way with situational influences.

The purpose of this meta-analysis was to quantify the magnitude of these effects across studies and conditions. The need for synthesizing the findings becomes apparent with many replication attempts using different materials, assessing numerous individual differences, and manipulating the experimental atmosphere in various ways. Thus, the purpose of this study was to investigate to what extent interrupted tasks profit from an average memory advantage and how likely the resumption of a task is when interrupted. So far, no meta-analysis has addressed this question, likely due to the heterogeneous approaches used and the complexity of synthesizing these findings.

Method

The present meta-analysis used anonymized data and was therefore exempt from approval by the local Ethics Committee of the Faculty of Human Sciences, University of Bern, in accordance with national law.

Literature search

The goal of our study was to identify the magnitude of both the Zeigarnik and Ovsiankina effect. Specifically, we were interested in the following: (1) the average ratio of recalled interrupted and finished tasks (Zeigarnik effect); (2) the percentage of resumed interrupted tasks (Ovsiankina effect); (3) if available or computable, the average effect size of the memory advantage for unfinished tasks compared to finished tasks. To search for relevant studies, we conducted our literature search on the 14th of September 2023 using PsycInfo and PSYINDEX using the search terms Zeigarnik, Zeigarnik-Effect, Zeigarnik Effect, Ovsiankina, Ovsiankina-Effect, Ovsiankina Effect, interrupted task(s), and unfinished task(s). This first search resulted in a total of 1455 publications and a total of 1349 publications after removing duplicates (see Fig. 1).

Fig. 1: Flow Chart of Study Selection Process for Meta-Analysis.

These 1349 publications were then screened by title and abstract. Publications were considered for full-text screening if the following criteria were met: (a) the study was empirically-quantitative; (b) the study focused on interrupted and finished tasks; (c) the study measured retrospective recall or resumption of interrupted tasks. All studies were screened by the first author for inclusion in full-text screening. Studies with ambiguous relevance to the inclusion criteria were screened in full text. In addition, to estimate interrater agreement, a random sample of 60 studies was rated by an independent rater instructed on the important criteria. The interrater agreement on inclusion or exclusion was perfect (κ = 1.00). Screening titles and abstracts resulted in 124 remaining publications considered for inclusion.
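As an illustration of the agreement statistic, the following minimal Python sketch assumes that κ refers to Cohen's kappa for two raters; the source does not say how it was computed. The screening decisions and the function name `cohens_kappa` are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by chance
    from each rater's marginal label frequencies. Undefined (division by
    zero) if both raters use one and the same label for every item.
    """
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_e = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical decisions for ten abstracts (not the study's data).
first = ["include"] * 3 + ["exclude"] * 7
second_same = list(first)                        # identical decisions
second_diff = ["include"] * 2 + ["exclude"] * 8  # one disagreement

print(cohens_kappa(first, second_same))            # 1.0
print(round(cohens_kappa(first, second_diff), 2))  # 0.74
```

A value of 1.00 therefore means that the two raters made the same decision on every sampled publication.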

The remaining 124 publications were then screened for their definitive inclusion by the first author. Studies were included for data extraction if they met all the following criteria: (a) the study was empirically-quantitative; (b) the study used interrupted and finished tasks and assessed free recall or the resumption of these tasks; (c) the resumption of tasks in the study was not forced by the experimenter; (d) the study provided sufficient information on recall and resumption to compute the ratio of recalled interrupted tasks to recalled finished tasks if not already provided; (e) the study was published in either English, German, or French; (f) the publication was available. Screening full texts resulted in a final sample of 59 publications. Of these publications, 38 investigated the Zeigarnik effect, 20 investigated the Ovsiankina effect, and 1 investigated both.

Data extraction

We coded the following data: Year of publication, the assessed effect (Zeigarnik effect, Ovsiankina effect, or both), number of experiments, type of experiment (experimental or quasi-experimental), total sample, sub-sample by conditions, sample type (i.e., students, adults, children), the sample size by gender, the average age and standard deviation of the sample as well as the age range (minimum and maximum age), the country in which the study was conducted, the different experimental conditions, the individual differences assessed, the type of tasks used, the manipulation of the interrupt, whether the task-interruption was manipulated within- or between-subjects, and the recall (immediate vs. delayed).

To compute the magnitude of each effect, we coded the following measures found throughout the literature (Butterfield, 1964): For the Zeigarnik effect, we first retrieved the value we label \({mean\; ratio}\frac{{IR}}{{CR}}\). This value is the ratio of interrupted tasks recalled (IR) to completed tasks recalled (CR), averaged across all participants. Typically, the ratio of IR/CR is first computed for each participant separately and then averaged in a second step. This measure was originally introduced by Zeigarnik (1927) but is strongly influenced by single extreme values and outliers.

A second measure was computed to assess the Zeigarnik effect: the value we label \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\). This measure was introduced later and provides the ratio of the average recalled interrupted tasks to the average recalled finished tasks. More specifically, the recall of interrupted and finished tasks is first averaged across participants before their ratio is computed. This value is substantially less influenced by individual extreme values and outliers and is therefore more suitable to represent an unbiased Zeigarnik effect. It is also the most commonly used measure (Footnote 1).
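To make the difference between the two ratio measures concrete, here is a minimal Python sketch. The recall counts are hypothetical and chosen only for illustration, and the function names are ours rather than taken from the literature.

```python
from statistics import mean

def mean_of_ratios(ir, cr):
    """Zeigarnik's (1927) measure: compute IR/CR per participant, then average."""
    return mean(i / c for i, c in zip(ir, cr))

def ratio_of_means(ir, cr):
    """Later measure: average IR and CR across participants, then take one ratio."""
    return mean(ir) / mean(cr)

# Hypothetical recall counts for five participants (not data from any study).
ir = [4, 5, 4, 5, 8]  # interrupted tasks recalled
cr = [5, 5, 4, 4, 1]  # completed tasks recalled; the last participant is extreme

print(round(mean_of_ratios(ir, cr), 2))  # 2.41
print(round(ratio_of_means(ir, cr), 2))  # 1.37
```

In this toy sample, a single participant with an individual ratio of 8.0 pulls the mean of ratios to 2.41, whereas the ratio of means stays at 1.37. Note that the per-participant ratio is undefined whenever a participant recalls no completed tasks.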

Next, we computed the proportion of interrupted tasks recalled to total tasks recalled for the Zeigarnik effect, which we labeled \({proportion}\frac{{IR}}{{TR}}\). This measure gives the average share of all recalled tasks that were interrupted tasks. It was first introduced by Marrow (1938a), who criticized Zeigarnik’s (1927) approach: With a ratio, the extent of memory superiority is reflected differently depending on the direction in which the superiority lies. By computing the ratios as mentioned above, the relation of interrupted recalled tasks to completed recalled tasks is distorted in favor of the interrupted ones (Footnote 2). On the other hand, the proportion of interrupted tasks to total tasks provides an unbiased measure of the memory advantage of unfinished tasks.
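The asymmetry of ratios can be illustrated with a short Python sketch. It uses two hypothetical, mirror-image participants; the values and the helper name `proportion_ir_tr` are assumptions made for illustration only.

```python
def proportion_ir_tr(ir, cr):
    """Share of all recalled tasks that were interrupted tasks (IR / TR)."""
    return ir / (ir + cr)

# (interrupted recalled, completed recalled) for two mirror-image participants.
participants = [(6, 2), (2, 6)]

ratios = [ir / cr for ir, cr in participants]
props = [proportion_ir_tr(ir, cr) for ir, cr in participants]

mean_ratio = sum(ratios) / len(ratios)
mean_prop = sum(props) / len(props)

print(round(mean_ratio, 2))  # 1.67
print(mean_prop)             # 0.5
```

Although the two participants cancel each other out exactly, the averaged ratio of 1.67 suggests an advantage for interrupted tasks, because ratios above 1 are unbounded while ratios below 1 are compressed between 0 and 1. The averaged proportion of 0.5 correctly indicates no advantage in either direction.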

Computing the effect size of the Zeigarnik effect turned out to be more complicated than expected. Most studies did not report any effect size, did not provide sufficient information to compute them, or simply did not investigate the difference in recall of both task types because other variables were of interest. Two exceptions are the studies by Hays (1952) and House and McIntosh (2000). However, we excluded the latter as the interruption of the task was manipulated between subjects. Thankfully, studies such as Zeigarnik (1927), Schlote (1930), Lewis (1944), Alper (1946), and Van Bergen (1968) provided detailed information on each subject’s recall. This allowed us to compute Cohen’s dz for paired sample t-tests by dividing the mean difference by the standard deviation of the mean difference (Lakens, 2013).
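As a minimal sketch of this computation, the Python snippet below derives Cohen's dz from per-participant recall counts. The data are hypothetical, and using the sample standard deviation (n - 1 in the denominator) is our assumption about the computation.

```python
from statistics import mean, stdev

def cohens_dz(ir, cr):
    """Cohen's dz: mean of the paired differences divided by their sample SD."""
    diffs = [i - c for i, c in zip(ir, cr)]
    return mean(diffs) / stdev(diffs)

# Hypothetical recall counts for six participants (not data from any study).
ir = [5, 4, 6, 5, 4, 6]  # interrupted tasks recalled
cr = [4, 5, 5, 5, 3, 5]  # completed tasks recalled

print(round(cohens_dz(ir, cr), 2))  # 0.6
```

A positive dz indicates better recall of interrupted tasks, and a negative dz indicates the inverse pattern.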

For the Ovsiankina effect, we calculated the resumption rate (%), which reflects the percentage of individuals who resumed the task after interruption. Ovsiankina (1928) differentiated in her original publication between resumed tasks and tasks with a tendency for resumption. As subsequent studies combined both types of resumption, we consolidated both values into one value of resumption. The resumption rate (%), therefore, reflects the percentage of individuals resuming interrupted tasks when the opportunity arises.

Results

In the first step, we calculated the weighted mean by sample size of the outcome measures for each publication across their experimental conditions. We aimed to partial out any manipulations of the experimental atmosphere and individual differences so that the fundamental effects could be calculated. Then, we again computed an overall weighted mean by sample size across all publications to converge the values into one combined value per measure. For the Zeigarnik effect, we combined the commonly used situational manipulations into three distinct conditions and analyzed the weighted average effect. Moreover, we analyzed the influence of achievement motivation across the different studies. The summary of the included studies and their averaged effect sizes can be found in Table 1 for the Zeigarnik effect and Table 2 for the Ovsiankina effect. A graphical depiction of the studies and their averaged effect sizes in relation to their year of publication is presented in Fig. 2 for the Zeigarnik effect and in Fig. 3 for the Ovsiankina effect.
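The two-step weighting can be sketched in Python as follows. The publications, ratios, and sample sizes are hypothetical, and weighting each publication in the second step by its summed condition sample sizes is our assumption; the text only states that sample size was used as the weight.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights` (here: sample sizes)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical publications: a list of (ratio, n) pairs, one per condition.
publications = {
    "study_a": [(1.20, 20), (0.90, 20)],
    "study_b": [(0.95, 60)],
    "study_c": [(1.10, 10), (0.80, 30)],
}

# Step 1: collapse the conditions within each publication.
per_publication = {}
for name, conditions in publications.items():
    values = [v for v, _ in conditions]
    sizes = [n for _, n in conditions]
    per_publication[name] = (weighted_mean(values, sizes), sum(sizes))

# Step 2: pool the publication-level values across publications.
overall = weighted_mean(
    [v for v, _ in per_publication.values()],
    [n for _, n in per_publication.values()],
)

print(round(overall, 2))  # 0.96
```

Summing the condition sample sizes presumes that conditions were run between subjects; for within-subject conditions, the publication's total sample size would be the appropriate weight.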

Table 1 Description of the Included Studies on the Zeigarnik Effect and Averaged Effect Sizes Across Conditions.
Table 2 Description of the Included Studies on the Ovsiankina Effect and Averaged Effect Sizes Across Conditions.
Fig. 2: Results from studies assessing the Zeigarnik effect.

Each point represents the ratio of the averaged mean recall of interrupted tasks to completed tasks from individual studies. The size of each point corresponds to the study’s sample size. A dotted line at a value of 1.0 serves as a reference, indicating an equal level of recall between interrupted and completed tasks. Points to the right of the line indicate a Zeigarnik effect, whereas points to the left indicate its inverse.

Fig. 3: Results from studies assessing the Ovsiankina effect.

Each point represents the percentage of resumed interrupted tasks from individual studies. The size of each point corresponds to the study’s sample size.

Zeigarnik effect

First, we analyzed the available effects of the \({mean\; ratio}\frac{{IR}}{{CR}}\), the measure Zeigarnik (1927) used in her original publication. This value was provided by or could be computed from six additional publications. Including the value provided by Zeigarnik (1927), a weighted \({mean\; ratio}\frac{{IR}}{{CR}}=1.13\) resulted (N = 7 publications), suggesting that interrupted tasks are recalled 13% better than finished tasks. If the value provided by Zeigarnik (1927) is excluded, a weighted \({mean\; ratio}\frac{{IR}}{{CR}}=1.09\) results (N = 6 publications), suggesting that interrupted tasks are recalled 9% better than finished tasks.

Next, we analyzed the available effects of the \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\), representing the most commonly used measure of the Zeigarnik effect. This value was provided or could be computed from 37 additional publications. Including the value calculated in Zeigarnik’s (1927) publication, a weighted \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.99\) resulted (N = 38 publications), suggesting that interrupted tasks are recalled about the same as finished tasks. If Zeigarnik (1927) is excluded, a weighted \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.99\) persists (N = 37 publications), again suggesting no reliable difference in recall between interrupted and finished tasks (see Fig. 4). These results further underscore that the values from Zeigarnik’s (1927) original computations are inflated.

Fig. 4: The Zeigarnik effect across studies, expressed as the ratio of the average recalled interrupted tasks to the average recalled finished tasks.

Data shown include Zeigarnik’s (1927) original findings, the weighted mean outcome from our meta-analysis that incorporates Zeigarnik’s (1927) study, and the weighted mean outcome excluding Zeigarnik’s (1927) original data.

Further, we analyzed the available effects of the \({proportion}\frac{{IR}}{{TR}}\). This value was provided or could be computed from thirteen additional publications. Including the value calculated in Zeigarnik’s (1927) publication, a weighted \({proportion}\frac{{IR}}{{TR}}=49.43 \%\) results (N = 14 publications), suggesting a slight memory disadvantage of interrupted tasks. If Zeigarnik (1927) is excluded, a weighted \({proportion}\frac{{IR}}{{TR}}=49.16 \%\) results (N = 13 publications), confirming a slight memory disadvantage for interrupted tasks.

Finally, we computed the effect size of Cohen’s dz for the memory advantage of interrupted tasks. Weighting the computed effect sizes by the sample size resulted in an overall Zeigarnik effect of dz = 0.15 (N = 8 publications), thus reflecting a small effect (Cohen, 2013).

Situational influence

We then investigated how the experimental atmosphere is related to the \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\). For this, we grouped experimental conditions into three distinct categories: A neutral condition, in which the experiment focuses on the tasks themselves (e.g., task orientation or neutral conditions), an achievement condition, in which experimental situation and instructions induce a performance atmosphere (e.g., achievement orientation, ego, formal, or stressful conditions), and a relaxed condition, in which a relaxed atmosphere is deliberately created (e.g., relaxed or informal conditions). We calculated the weighted mean \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\) for each condition (N = 16 publications). The weighted average ratio for the neutral conditions was \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.96\), for the achievement conditions \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.88\), and for the relaxed conditions \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=1.07\). The experimental atmosphere appeared to influence recall patterns, with interrupted tasks recalled slightly less well in achievement-oriented settings, about equally in neutral settings, and slightly better in relaxed conditions.

Achievement motivation

Next, we computed the weighted mean \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\) for the most prominent individual difference: Achievement motivation. In the seven studies that assessed achievement motivation, participants have typically been categorized into high, moderate, and low achievement motivation (N = 7 publications). The weighted average ratio for individuals with high achievement motivation was \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=1.00\), for moderate achievement motivation \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.92\), and for low achievement motivation \({ratio}\frac{{mean}({IR})}{{mean}({CR})}=0.96\). Achievement motivation showed little impact on recall, with individuals across high, moderate, and low motivation levels recalling interrupted and completed tasks at similar rates.

Ovsiankina effect

For the Ovsiankina effect, we assessed the sole measure used for the resumption of interrupted tasks, the resumption rate (%). Including the value obtained in Ovsiankina’s (1928) publication, the weighted resumption rate was 67.00% (N = 21 publications), demonstrating that interrupted tasks are frequently resumed. When Ovsiankina (1928) is excluded, the weighted resumption rate is 66.79% (N = 20 publications), which is only a minor difference. Studies thus reliably demonstrated that interrupted tasks are resumed roughly 67% of the time (see Fig. 5), that is, more often than not.
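The comparison of the weighted resumption rate with and without a single study can be sketched in a few lines of Python. This is a minimal illustration only: the three studies, their resumption rates, and their sample sizes are invented, and weighting by sample size is an assumption made for the sketch rather than a description of the weighting scheme actually applied.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of `values` using non-negative `weights`."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the sum of weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def weighted_rate(studies, exclude=None):
    """Weighted resumption rate (%) over `studies`, optionally leaving one out.

    Each study is a tuple (label, resumption rate in %, sample size).
    Weighting by sample size is an assumption of this sketch.
    """
    kept = [s for s in studies if s[0] != exclude]
    rates = [rate for _, rate, _ in kept]
    sizes = [n for _, _, n in kept]
    return weighted_mean(rates, sizes)


# Hypothetical data, not taken from any of the included publications.
studies = [
    ("Study A", 80.0, 20),
    ("Study B", 60.0, 40),
    ("Study C", 70.0, 40),
]

if __name__ == "__main__":
    print(f"all studies:     {weighted_rate(studies):.2f}%")
    print(f"without Study A: {weighted_rate(studies, exclude='Study A'):.2f}%")
```

With these invented numbers, leaving out the small study with the highest rate lowers the weighted rate from 68.00% to 65.00%. The smaller the excluded study is relative to the rest, the smaller this shift becomes.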

Fig. 5: The Ovsiankina effect across studies (percentage of resumed interrupted tasks).

Data shown include Ovsiankina’s (1928) original findings, the weighted mean outcome of our meta-analysis including Ovsiankina’s (1928) study, and the weighted mean outcome excluding it.

Discussion

The present meta-analysis revealed significant inconsistencies in the Zeigarnik effect. The measure applied originally by Zeigarnik (1927), which we labeled \({mean\; ratio}\frac{{IR}}{{CR}}\), overestimated the effect and is particularly susceptible to extreme values. If a ratio is to be computed, the measure we labeled \({ratio}\frac{{mean}({IR})}{{mean}({CR})}\) is more suitable, although not without its drawbacks. It distorts values in favor of interrupted tasks. As such, we would recommend the use of the measure introduced by Marrow (1938a) that we labeled \({proportion}\frac{{IR}}{{TR}}\), that is, the average proportion of interrupted tasks recalled in relation to the total amount of tasks. This measure does not suffer from the same disadvantages as the other measures and provides information on the recall of finished tasks as well.
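To illustrate why the three measures can diverge, the following minimal Python sketch computes each of them for a small set of invented per-participant recall counts. The data, the function names, and the reading of TR as the total number of recalled tasks (IR + CR) are assumptions made for illustration, not values or definitions taken from the included studies.

```python
from statistics import mean


def mean_ratio(participants):
    """Mean of the individual IR/CR ratios.

    Undefined (division by zero) if any participant recalls no completed task.
    """
    return mean(ir / cr for ir, cr in participants)


def ratio_of_means(participants):
    """Ratio of the group mean of IR to the group mean of CR."""
    mean_ir = mean(ir for ir, _ in participants)
    mean_cr = mean(cr for _, cr in participants)
    return mean_ir / mean_cr


def mean_proportion(participants):
    """Mean individual proportion IR/TR, assuming TR = IR + CR."""
    return mean(ir / (ir + cr) for ir, cr in participants)


# Hypothetical (IR, CR) recall counts; the first participant is an extreme case.
participants = [(6, 1), (4, 5), (3, 5), (3, 5)]

if __name__ == "__main__":
    print(f"mean ratio IR/CR:          {mean_ratio(participants):.2f}")
    print(f"ratio mean(IR)/mean(CR):   {ratio_of_means(participants):.2f}")
    print(f"proportion IR/TR:          {mean_proportion(participants):.2f}")
```

In this invented sample, a single participant who recalls six interrupted tasks and one completed task pulls the mean of the individual ratios up to 2.00, while the ratio of means stays at 1.00 and the mean proportion at about 0.51. The individual ratio is also undefined for a participant who recalls no completed tasks, whereas the proportion remains defined as long as at least one task is recalled.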

Our analysis of the Zeigarnik effect is quite sobering. When the result of Zeigarnik’s (1927) original study is excluded, the ratio of interrupted to completed tasks is 0.99. Interrupted tasks account for an average of 49.16% of the recalled tasks. Although the effect size could not be calculated for numerous studies, the available data yielded an average effect size of dz = 0.15, indicating a small effect. The current findings do not support a memory advantage for interrupted tasks when situational influences and individual differences are not accounted for. Therefore, if anything, the Zeigarnik effect should relate to situational influences, individual differences, or the interaction of the two.

When distinguishing experimental situations into the three distinct conditions, the Zeigarnik effect appeared in what we labeled a relaxed condition. No Zeigarnik effect could be observed in situations that we labeled neutral conditions. In these situations, the experimental atmosphere is not manipulated through instructions or the experimenter’s behavior, and the focus lies strictly on the tasks themselves. Similarly, no Zeigarnik effect occurred in the situations that we labeled achievement conditions, in which the tasks are framed as some intelligence measure. One possible explanation is that such situations evoke different recall patterns based on individual differences. Success-oriented individuals might find such situations exciting and stimulating, whereas failure-oriented individuals might perceive such situations as threatening to the ego (Ghibellini and Meier, 2024). The former could favor the retrieval of interrupted tasks, whereas the latter might favor the retrieval of completed tasks.

Focusing solely on achievement motivation, no clear pattern in the Zeigarnik effect occurs. Highly achievement-motivated individuals recalled a comparable amount of interrupted and finished tasks, whereas moderately achievement-motivated individuals recalled even more finished tasks than individuals low in achievement motivation. We assume that achievement motivation is highly dependent on situational influences to evoke specific recall patterns. Unfortunately, only the study by Atkinson (1953) provided us with measures of both the manipulated experimental atmosphere and its relation to achievement motivation. In his study, individuals high in achievement motivation exhibited the strongest Zeigarnik effect in an achievement condition (\({ratio}\frac{{mean}({IR})}{{mean}({CR})}=1.24\)). In contrast, individuals low in achievement motivation exhibited the strongest Zeigarnik effect in a relaxed condition (\({ratio}\frac{{mean}({IR})}{{mean}({CR})}=1.27\)).

Of the studies included in this meta-analysis, the results of Baddeley (1963) exhibit the most notable memory advantage for interrupted compared to finished tasks. However, Baddeley’s approach differs from the other studies. Participants were presented with anagrams. If they did not solve them within sixty seconds, they were shown the solution. Afterward, they had to recall the anagram solution. Thus, although the intention of solving the anagram was interrupted, the subsequent presentation of the anagram solution also terminated the intention of solving the anagram. Hence, unsolved anagrams represent interrupted but finished intentions, which stands in stark contrast to other studies measuring the recall of interrupted and unfinished intentions. These results cannot be compared to the classical Zeigarnik effect studies, and Baddeley (1963) himself labels the results “Zeigarnik-like”. Therefore, other mechanisms must be at play, such as the experience of a discrepancy between the anticipation of successfully solving an anagram and the subsequent failure, which results in a memory advantage for unsolved anagrams.

The Ovsiankina effect, on the other hand, appears to be a more general tendency. However, we must note here that the Ovsiankina effect was addressed more descriptively. Studies investigating the effect assessed the percentage of resumed interrupted tasks. If some degree of experimental manipulation was involved, researchers usually tried to increase or decrease this tendency and evaluated their findings descriptively or in relation to a baseline condition. This, however, does not negate the fact that a general tendency to resume interrupted tasks can be observed reliably. It is, therefore, possible that the Ovsiankina effect received less attention than the Zeigarnik effect not only because of its consistent and readily observable nature, but also because it remained less well known among researchers, who may have been more drawn to the more widely discussed and theoretically intriguing Zeigarnik effect.

The difficulty, however, lies in the interpretation of the resumption rate. We were careful to include only studies in which the experimenter did not force the resumption of tasks, allowing the resumption to occur naturally. Since the included studies used heterogeneous approaches in measuring resumption, computing a baseline resumption rate is challenging: While some experimenters left the room or made participants wait and observed the self-initiated resumption of the tasks with or without the presence of alternative tasks (Birk et al., 2020; Henle and Aull, 1953; Katz, 1938; Mahler, 1933; McGraw and Fiala, 1982; Nowlis, 1941; Reeve et al., 1986; Sternlicht and Wanderer, 1966), others forcefully interrupted the task or presented participants with new tasks and measured the resistance of subjects to return to the original tasks (Rethlingshafer, 1941; Rösler, 1955). Ovsiankina (1928) used both approaches in her original study. We would argue that the resulting resumption rate from our analysis should not be interpreted as an absolute measure, but rather should be seen as indicative. On average, even when different approaches are implemented, interrupted tasks are resumed more often than they are not, effectively supporting the existence of the Ovsiankina effect.

It is important to note that, due to the specific search terms employed, this meta-analysis does not encompass all potential publications on the topic. The computed effects and ratios are limited to the studies included in our meta-analysis. We recognize that employing a more comprehensive set of search terms might have yielded a broader spectrum of studies. However, narrowing our search terms helped us to retrieve studies directly related to the effects in question (Liberati et al., 2009). Further, including more lenient search terms could have increased the heterogeneity of research approaches, making it even more challenging to draw cohesive conclusions (Higgins et al., 2003). The included studies already employed a broad array of approaches to induce the Zeigarnik and Ovsiankina effects, which complicated the synthesis of the findings. Moreover, limiting search terminology made the systematic review of studies more feasible given limited time resources (Shamseer et al., 2015).

The screening and selection of the literature resulted in a large number of excluded studies. Most often, studies were excluded from the final sample because they did not fulfill the inclusion criteria of being empirical-quantitative and referring to the original studies of Zeigarnik (1927) or Ovsiankina (1928), or simply did not relate to the recall and resumption of interrupted tasks. Other prominent reasons for exclusion were insufficient data to compute the relevant measures or a lack of availability. While the limited number of studies included in the computation of each measure needs to be taken into consideration when interpreting the results, we would argue that our meta-analysis still provides valuable new insight. By pooling participants across multiple studies, each of which included more than 100 participants on average, our findings hold more statistical power, reduce sampling error and variability, and are more precise than single studies (Borenstein et al., 2021; J. P. Higgins, 2008; Valentine et al., 2010).

The question now arises as to why Zeigarnik (1927) could demonstrate the memory advantage for interrupted tasks without manipulating the situation or considering individual differences. Here, a historical perspective may be necessary. Zeigarnik (1927) herself states that participants carried out the tasks conscientiously either out of a sense of duty toward the experimenter, out of ambition, or motivated by the tasks themselves. These observations underline the heightened task involvement of participants during the experiment. If we consider that the experiments were conducted at the beginning of the 20th century, we can assume a high authority of the experimenter, given their association with academic institutions. At the time, professors and universities in Europe enjoyed great respect and prestige, as McCain reports hearing that “… professors are not human beings, they are gods” (McCain, 1960, p. 100). Accordingly, the then-present situation demanded excellence and performance without the need for situational manipulation. If we assume the subjects to be predominantly students, we are presented with a highly achievement-motivated sample privileged to study at a prestigious university.

We assume that, nowadays, experimental atmospheres have lost their performance-demanding qualities due to decreased experimenter authority. Manipulation of the experimental atmosphere has become mandatory to provoke a Zeigarnik effect. Further, individuals capable of sufficient task involvement are needed. Herein lies the problem: task involvement has become more difficult as we are increasingly faced with interruptions. Mobile phone notifications, e-mails at work, and a tendency for multitasking all impede our ability to focus on a task (Kushlev and Dunn, 2015; Ophir et al., 2009; Stothart et al., 2015). Subjects in Zeigarnik’s time may have been much more capable of engaging in a task. Finding such task-involved individuals in the present time should prove more difficult, which could contribute to the fact that today’s findings on the Zeigarnik effect do not reach the same magnitude as in Zeigarnik’s (1927) time.

Strikingly, to this day, the Zeigarnik effect is still freely cited, taken as a given, and used as an explanation for a plethora of research findings. A quick literature search on the Zeigarnik effect reveals numerous publications that rarely question the effect’s validity. Studies such as the review by MacLeod (2020) or the dissertation by Van Bergen (1968) have greatly contributed to questioning Zeigarnik’s findings. Nevertheless, their conclusions do not seem to have reached a sufficiently wide audience. Zeigarnik’s (1927) finding that interrupted actions are remembered better than completed actions seems highly intuitive, which may contribute to the popularity of the effect. However, it must again be emphasized that the findings simply cannot be replicated reliably, which was confirmed by the present findings and underlined with quantitative measures.

Modern theories of intentions have turned away from the abstract concept of tension toward a theory of activation (Goschke and Kuhl, 1993). The representation of an intention persists in a state of heightened subthreshold activation. Such activation ensures that the intention persists, prompting us to act upon it when the opportunity arises. Nevertheless, the observation remains that interrupted intentions are not simply forgotten but are reliably resumed and urge us toward their completion. Intentions must, therefore, take on a unique role in our memory. However, they do not always possess a conscious memory advantage compared to finished intentions in the presence of multiple tasks. As MacLeod puts it: “At best, it would appear to hinge on certain individual difference characteristics; at worst, it is simply not replicable” (MacLeod, 2020, p. 1081).

Summary

In conclusion, our meta-analysis indicates that the Ovsiankina effect may represent a general tendency. In contrast, the replicability of the Zeigarnik effect remains questionable, and the supposed memory advantage for interrupted compared to finished tasks is certainly not universal.