Introduction

Working memory (WM) is a limited-capacity, short-term store for the temporary retention and manipulation of information, and it plays a critical role in cognitive functions such as language comprehension, learning and reasoning1,2. Its limited capacity is a distinctive feature that has long interested cognitive researchers; WM has traditionally been suggested to hold only a limited number of items simultaneously, ranging from four to seven, as described by Miller, Cowan and Rouder et al.3,4,5. Simple and complex span tasks were primarily used in this field, and other paradigms, including change detection tasks, were introduced later6,7,8. In one example of a change detection task, Luck and Vogel asked participants to memorize stimulus arrays consisting of 1–12 colored squares with different orientations9. After a short period, a test array appeared and participants were asked whether the test pattern was the same as the one previously presented, or whether any of the squares had changed in color or orientation. They found that recall accuracy was significantly higher when the number of memorized items was lower than 3–4. Evidence from these paradigms was used to describe these capacity limitations with ‘the slot model’, which conceptualizes WM capacity as a predefined number of slots available in an all-or-none format for information storage10.

Creative adjustments of change detection tasks and the introduction of more complex paradigms, such as analog recall paradigms, shifted the focus from WM content to the fidelity with which items are preserved. With paradigms that vary stimulus features and dimensions, it was suggested that the precision of WM depends on how flexibly resources are allocated to it. If the presented objects have higher visual complexity, fewer items can be maintained, giving rise to another mainstream theory: the ‘resource model’11. This model proposes a dynamic allocation of resources to memorized items, whereas the slot model did not provide an estimate of the quality or resolution of recall12,13. The resource model rests on the premise that measurements of stimuli in WM are inherently noisy and that this noise increases as the number of stimuli increases, which in turn limits memory recall14.

Errors in recall can stem from different stages of processing. Studies using delayed estimation and change detection tasks distinguish three prominent categories: target errors, swap errors and random guessing. Target errors occur when a participant’s response falls within a range close to the probed item in WM, whereas swap errors occur when the response falls around a non-probed item. New findings suggest that swap errors may result from familiarity signals affecting noise levels as set sizes and delay intervals increase, leading to increased location noise around the non-target15. Random guessing, in contrast, is attributed to a failure to produce a response close to any of the target or non-target feature values, yielding a uniform distribution of responses14,16.

Recent evidence suggests that a strict categorization of visual WM into slot and resource models is less reflective of experimental data and that a stimulus-specific bias theory is more relevant17,18. For instance, as mentioned previously, stimulus characteristics, object structure, complexity and overall scene structure determine which model is a better fit19,20,21. The various paradigms, such as delayed match-to-sample (DMS) and analog recall tasks, each elucidate different characteristics and features of WM limits because of the distinct differences in their overall framework. For example, DMS tasks interpret participant responses as correct or incorrect, assuming that an item is either fully maintained or forgotten, without considering memory resolution. In contrast, analog recall tasks typically present a range of options for participants to choose from, assuming that internal and external noise influence memory recall. This raises the question of how these tasks evaluate WM capacity and whether there are fundamental differences in performance measures and precision.

We conducted this study to understand the correlation between WM precision and capacity, and the underlying similarities between the DMS and analog recall tasks. Participants’ performance in the DMS task with checkerboard stimuli was correlated with their performance in the sequential paradigm with bar stimuli, revealing an intrinsic association between the two measures of WM. Our findings may also be applied in clinical settings, for example with patients who have neurological diseases such as Alzheimer’s disease and multiple sclerosis and who require cognitive follow-up. In these situations, content-based WM tasks, which are easier to use and less time-consuming, can be administered and provide similar assessments.

Methods

Setting

Visual stimuli were generated with MATLAB (MATLAB 2019a, The MathWorks, Inc, Natick, MA) and controlled with the Psychtoolbox 3 extension22. Participants sat in a dimly lit room at a distance of ~48 cm from a cathode ray tube monitor (CRT, 15″, refresh rate of 75 Hz). A total of 30 healthy volunteers (7 females, 26.56 ± 4.61 years old, range 21 to 37 years) were recruited for this study and completed two visual WM tasks: an analog recall paradigm with sequential bar presentation and a DMS task with checkerboard stimuli.

Sequential paradigm with bar stimuli

In the sequential task, each trial began with a central fixation point (0.26°) displayed for 2 s, followed by the sequential presentation of a red, a blue and a green bar (in pseudorandom order, 2.57° by 0.19°, Fig. 1A). The minimum angular difference between bars was 10 angular degrees. Each bar was presented for 500 ms, with a 500 ms delay (during which a blank screen was displayed) between bars. Participants were instructed to memorize the orientation of each bar. After presentation of the last bar, a vertical probe bar (in red, blue, or green) was presented. Participants were asked to use a computer mouse to adjust the orientation of the probe bar to match that of the previously displayed bar with the same color (target bar). After confirming their response by clicking the right mouse button, they received visual feedback showing the correct orientation of the target bar, their response, and the angular difference between the two. We recorded the orientation of the presented bars, the participant’s response, and the recall error (angular difference between target angle and participant response) for each trial. We collected data from 6 blocks, each consisting of 30 trials (i.e., 180 trials per person). A low memory load task was also implemented, consisting of a single bar stimulus instead of three. Each participant performed 30 trials of this one-bar analog report task, and bar orientation, participant response, and recall error were recorded. Before beginning the main task, a 10-trial training block was used to familiarize the participants with the procedure.
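The recall error described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' MATLAB code: the function names `orientation_error` and `summarize_trial` are hypothetical, and it assumes that the error is signed and wrapped to [−90°, 90°) because a bar at 0° is indistinguishable from one at 180°. The source does not state which convention was used.

```python
def orientation_error(response_deg, target_deg):
    """Signed difference between two bar orientations, wrapped to [-90, 90).

    Assumption: orientations are 180-degree periodic (a bar has no head or
    tail), so the difference is taken modulo 180 rather than modulo 360.
    """
    return (response_deg - target_deg + 90.0) % 180.0 - 90.0


def summarize_trial(presented_deg, probe_index, response_deg):
    """Collect the per-trial quantities named in the text (hypothetical layout)."""
    target = presented_deg[probe_index]
    return {
        "presented": list(presented_deg),   # orientations of the shown bars
        "response": response_deg,           # orientation set by the participant
        "recall_error": orientation_error(response_deg, target),
    }


if __name__ == "__main__":
    # Three bars, the second one probed, response 15 degrees clockwise of it.
    print(summarize_trial([30, 80, 140], probe_index=1, response_deg=95))
```

Wrapping matters near the boundary: a response of 170° to a 10° target is only 20° away, not 160°.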

Fig. 1

Schematic design of the working memory (WM) tasks. (A) In the analog recall paradigm with sequential bar presentation, participants were asked to memorize the orientations of three consecutively presented bars. After a 1 s delay interval they were asked to match the probe bar to the angle of the previously presented bar with the same color. (B) For the delayed match-to-sample (DMS) task with checkerboard stimuli, participants were asked to memorize a checkerboard pattern and, after a random delay interval of 0.5, 1, 2, 4 or 8 s, to select the previously presented pattern from two different checkerboard patterns.

Delayed match-to-sample task (DMS)

For the DMS task, participants were asked to look at a central fixation point (0.26°) for 2 s, after which a 4 by 4 checkerboard (4.83° × 4.83°) was presented (Fig. 1B). On each trial, 6 to 10 squares (out of 16) were yellow and the rest were red. Participants were asked to memorize the pattern. After a delay of 0.5, 1, 2, 4, or 8 s, during which only the fixation point was displayed, participants were presented with two checkerboards. One was the same as the previously shown pattern (sample probe) and the other had one yellow square swapped randomly with a red square. These two checkerboards were presented on the right and left sides of the screen (3° from each other). Participants reported whether the right or the left checkerboard matched the sample stimulus by pressing the right or left arrow key on the computer keyboard. The test patterns always had exactly the same number of red and yellow squares as the sample checkerboard; only the location of a single square was modified on each trial, so the overall color composition was unchanged. Test checkerboards remained on the screen until the participant responded or the 4 s response window ended. Visual feedback was given using a disc (1.93°) that was green for a correct response, red for an incorrect response, or blue if no response was given within 4 s. Each block consisted of 30 trials (5 repetitions for each delay) and each participant completed 6 blocks in total (i.e., 180 trials). Before beginning the main task, a 10-trial training block similar to the original test was used to familiarize the participant with the paradigm.
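The stimulus logic of a DMS trial can be illustrated with a short Python sketch. It is not the Psychtoolbox code used in the study: the names `make_sample`, `make_foil` and `make_trial` are hypothetical, the board is stored as a flat list of color labels, and the delay is drawn at random per trial as a simplification (the study balanced delays within blocks).

```python
import random

GRID_CELLS = 16  # 4 x 4 checkerboard, stored as a flat list of 16 cells


def make_sample(rng):
    """Return 16 color labels with between 6 and 10 yellow squares."""
    n_yellow = rng.randint(6, 10)  # inclusive on both ends
    yellow_cells = set(rng.sample(range(GRID_CELLS), n_yellow))
    return ["yellow" if i in yellow_cells else "red" for i in range(GRID_CELLS)]


def make_foil(sample, rng):
    """Copy the sample and swap one yellow square with one red square."""
    yellow = [i for i, color in enumerate(sample) if color == "yellow"]
    red = [i for i, color in enumerate(sample) if color == "red"]
    a, b = rng.choice(yellow), rng.choice(red)
    foil = list(sample)
    foil[a], foil[b] = foil[b], foil[a]  # color counts stay the same
    return foil


def make_trial(rng, delays=(0.5, 1, 2, 4, 8)):
    """Build one trial: sample, foil, delay and the side showing the match."""
    sample = make_sample(rng)
    return {
        "sample": sample,
        "foil": make_foil(sample, rng),
        "delay_s": rng.choice(delays),            # simplification, see lead-in
        "match_side": rng.choice(["left", "right"]),
    }


if __name__ == "__main__":
    trial = make_trial(random.Random(0))
    print(trial["delay_s"], trial["match_side"])
```

Because the foil is produced by a swap, it differs from the sample in exactly two cells while keeping the same number of yellow and red squares, which is the constraint stated in the text.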

Statistical analysis

The total sample size required for this study was calculated using G*Power 3.1, targeting the detection of a moderate correlation (ρ) of 0.5 in the bivariate model, with an α error probability of 0.05 and a desired power (1 − β error probability) of 80%23. The required sample size was 29 participants.
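As a rough plausibility check on this figure, the sample size for detecting a correlation can be approximated with the Fisher z-transformation. The sketch below is in Python and is an approximation only: G*Power's bivariate normal model uses an exact calculation, and a two-tailed test is assumed here because the source does not state the number of tails. The function name `n_for_correlation` is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(rho, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation rho (Fisher z method).

    n = ((z_{1 - alpha/2} + z_{power}) / atanh(rho))^2 + 3
    Assumption: two-tailed test against a null correlation of zero.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return ((z_alpha + z_beta) / math.atanh(rho)) ** 2 + 3.0


if __name__ == "__main__":
    # Settings reported in the text: rho = 0.5, alpha = 0.05, power = 0.80.
    print(round(n_for_correlation(0.5), 2))
```

With the reported settings the approximation lands very close to the 29 participants stated above; small differences from the exact method are expected.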

The statistical analyses were performed with MATLAB 2023a. An analog report MATLAB toolbox from the Bays lab was used to calculate the circular mean of recall error for each participant12. Precision, defined as the reciprocal of the circular standard deviation of recall error, was calculated for each individual. To investigate the sources of error in recall, the three-component Mixture Model was fitted. Using this model, the target proportion (Gaussian variability in reporting the orientation around the target bar), non-target proportion (swap error, Gaussian variability in misreporting the orientation around non-target bars) and uniform proportion (random guessing) were calculated for each participant. Performance in the checkerboard task was defined as the ratio of correct responses to all given responses.
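The quantities in this paragraph can be made concrete with a short sketch. The Python code below is not the Bays lab toolbox and does not reproduce its interface; all function names are hypothetical. It assumes errors are expressed in radians on the full circle (for 180°-periodic orientations this means doubling the angle first), it omits any bias correction, and it models the Gaussian-like variability with a von Mises distribution, the circular analogue of a Gaussian. Only the mixture density is shown; fitting the proportions (for example by expectation maximization) is left out.

```python
import cmath
import math


def _resultant(errors_rad):
    """Mean resultant vector of a list of angles, as a complex number."""
    return sum(cmath.exp(1j * e) for e in errors_rad) / len(errors_rad)


def circ_mean(errors_rad):
    """Circular mean: direction of the mean resultant vector."""
    return cmath.phase(_resultant(errors_rad))


def circ_sd(errors_rad):
    """Circular standard deviation, sqrt(-2 ln R), with R the resultant length."""
    r = min(abs(_resultant(errors_rad)), 1.0)
    return math.inf if r == 0.0 else math.sqrt(-2.0 * math.log(r))


def precision(errors_rad):
    """Reciprocal of the circular SD (no bias correction applied here)."""
    sd = circ_sd(errors_rad)
    return math.inf if sd == 0.0 else 1.0 / sd


def bessel_i0(x, terms=60):
    """Modified Bessel function I0 by power series (fine for moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_density(response, target, nontargets, kappa, p_target, p_nontarget):
    """Three-component density: target + swap (non-target) + uniform guessing.

    Assumes at least one non-target whenever p_nontarget > 0.
    """
    p_uniform = 1.0 - p_target - p_nontarget
    density = p_target * von_mises_pdf(response, target, kappa)
    density += p_uniform / (2.0 * math.pi)
    if nontargets:
        swap = sum(von_mises_pdf(response, nt, kappa) for nt in nontargets)
        density += p_nontarget * swap / len(nontargets)
    return density


if __name__ == "__main__":
    errors = [-0.3, -0.1, 0.0, 0.1, 0.2]
    print(circ_mean(errors), precision(errors))
```

In this parameterization the three proportions sum to one, so the uniform proportion is whatever is left after the target and non-target components are accounted for.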

To calculate the correlation between tasks, we computed Spearman’s correlation coefficient between performance in the DMS task and, separately, recall error and precision in the sequential task. We also measured the correlation (Spearman’s) between performance in the DMS task and the three sources of error obtained from the three-component Mixture Model. Furthermore, correlations between performance at each delay period (0.5, 1, 2, 4, and 8 s) and the recall error, precision, and target, non-target, and uniform proportions for each bar were calculated separately. Correction for multiple comparisons was done using the Bonferroni method to calculate adjusted p-values.
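The two statistical steps named here, rank correlation and Bonferroni adjustment, can be sketched as follows. This is a dependency-free Python illustration with hypothetical function names, not the MATLAB routines used in the study. It computes Spearman’s coefficient as the Pearson correlation of average ranks and leaves out the p-value calculation, which requires a t or permutation distribution.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Made-up illustrative numbers, not data from the study.
    dms_accuracy = [0.71, 0.80, 0.86, 0.92, 0.95]
    recall_error = [24.0, 19.5, 21.0, 14.2, 12.8]
    print(spearman_rho(dms_accuracy, recall_error))
    print(bonferroni([0.0006, 0.2, 0.5]))
```

The cap at 1 in the Bonferroni step means that an adjusted p-value of exactly 1 can arise for a clearly non-significant test.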

Written informed consent was obtained from all participants. This study followed the latest update of the Declaration of Helsinki and was approved by the Iranian National Committee of Ethics in Biomedical Research (Approval ID: IR.MUI.MED.REC.1400.441).

Results

A total of 30 healthy participants (7 females, 26.56 ± 4.61 years old, range 21 to 37) completed the analog recall task with sequential bar presentation and the DMS task with checkerboard stimuli, both illustrated in Fig. 1A,B. Using Spearman’s correlation analysis, we found that participants’ performance in the DMS task was negatively and significantly correlated with the mean recall error in the sequential task (r = −0.60, p < 0.003, Fig. 2A). Performance in the DMS task was also positively correlated with precision in the sequential task (r = 0.60, p < 0.003, Fig. 2B). Furthermore, while DMS performance was positively correlated with target proportion (r = 0.59, p < 0.003, Fig. 2C), it was negatively correlated with non-target proportion (swap error, r = −0.55, p < 0.01, Fig. 2D). Performance did not show a significant correlation with uniform proportion (r = −0.22, p = 1, Fig. 2E).

Fig. 2

Correlations between parameters in analog recall paradigm with performance of DMS task. Correlation between (A) recall error, (B) precision, (C) target, (D) non-target (swap error), and (E) uniform proportions with DMS performance (ratio of correct response). Rho and p value of Spearman’s correlation are provided above each subplot.

A similar pattern was observed when we calculated the correlations between DMS performance at each delay and the recall error, precision, target proportion, non-target proportion, and uniform proportion for each bar order (Fig. 3A–E). Performance in the DMS task was more strongly correlated with the recall error of the third presented bar (r = −0.59, p < 0.003) than with that of the second (r = −0.56, p < 0.008) or first bar (r = −0.57, p < 0.006, Fig. 3A). For precision, the correlation reached statistical significance only for the third bar (3rd: r = 0.7, p < 0.001; 2nd: r = 0.46, p = 0.055; 1st: r = 0.45, p = 0.06, Fig. 3B). The correlations between DMS performance and the sources of error are presented in Fig. 3C–E.

Fig. 3

Correlation between bar orders in the sequential paradigm vs. delay intervals in DMS. Heatmap of Spearman’s correlation coefficient values between (A) recall error, (B) precision, (C) target, (D) non-target, and (E) uniform proportions, from bars 1 to 3 with checkerboard performance from five distinct delay periods (0.5, 1, 2, 4, and 8 s). Asterisk shows significant correlation (p < 0.05), while orange shades represent positive and blue shades represent negative correlations.

A separate analog recall paradigm with a single bar stimulus was included to test whether the observed effects were due to task difficulty. Recall error and precision in this task were both significantly correlated with overall performance in the DMS task (r = −0.56, p < 0.005 and r = 0.38, p < 0.05, respectively), suggesting that the effects are not due to task difficulty.

Discussion

The search for a comprehensive model explaining behavioral data from WM tasks has led to the emergence of two prominent schools of thought: the slot-based model and the resource-based model. While each model possesses unique properties capable of explaining various observation patterns in WM assessment, the discrepancies between them have been a subject of debate. Although recent studies have introduced hybrid theories, such as the categorical resource model, which incorporate features from both traditional models, direct correlative assessments have not been clearly implemented to clarify the differences of tasks associated with these theories18,24. Our correlational study aimed to unveil similarities and differences between these two mainstream tasks commonly utilized in the development of the slot and resource models which will be discussed in comparison to available models.

The moderate to high correlation observed between recall error and precision with DMS performance, particularly noticeable for the third bar (the least challenging to memorize because of the shortest delay interval), suggests that when the memory load is lower in the sequential task, the results are more strongly correlated with those of the less challenging task (i.e., DMS). This complements a study by Zokaei et al., which compared digit span measures with a precision task and found significant correlations between performance in backward digit span (a more difficult slot-based model task) and precision in a resource model task25. In addition, our study characterized the correlation pattern between these two tasks at a higher temporal resolution in the DMS task (delays from 0.5 to 8 s) and the sequential task (1–3 bars). These analyses showed no significant correlation in the specific delay periods (Fig. 3, explained below), emphasizing the importance of considering the full range of delay intervals.

The absence of a significant correlation between uniform proportion and performance in the DMS task suggests that binary paradigms cannot be used to study this type of error. The three-component Mixture Model, introduced by Bays et al., categorizes error patterns into target, non-target, and uniform error (proportion). However, with the incorporation of the neural resource model (Stochastic sampling), the uniform error seems less relevant26,27.

The variation in correlation significance observed in our work is also in line with a 2020 comparative analysis evaluating three visual WM tasks with distinct properties involving different types of stimuli and presentation formats. This analysis critiqued the relevance of three prominent models (pure slot model, pure resource model, and hybrid models) in explaining WM capacity21. While the slot-based model lacks the capacity to represent variability in memory resolution, continuous resource models assume internal and external noise according to signal detection theory. Although their findings supported the pure resource model regardless of task type, the joint model analysis showed that performance in two tasks cannot be described with a single estimate of capacity or resource; the best-fit model varies depending on whether stimuli are presented simultaneously or sequentially. Resource distribution for information encoding and maintenance depends on information content and encoding conditions. Studies supportive of the discrete slot model have often overlooked base rate manipulation, set size variations (i.e., the number of items to be memorized) and response bias (the tendency to endorse a specific response). Therefore, the observed variability in the correlation coefficients between specific delay intervals and recall error and precision could be attributed to the different experimental settings in our study.

Another example of variation in model fitting is a study that compared the variable precision model with hybrid models such as the slot-averaging, slot-resource and equal precision models to see which best fits the data from a change detection task28. The results showed the best fit for the variable precision model, with improved fits also seen for the slot-averaging and slot-resource models compared to the infinite precision item limit model and the equal precision model. The variable precision model cannot, however, distinguish between random guessing and low-precision memory recall, and another study addressed this point using a whole report memory task, in which participants had to recall all items from each trial. Under high memory load conditions, a large proportion of participant responses were guesses29. When participants were free to report the items in their preferred order, a decline in memory performance was seen after the third item, reflecting a uniform error distribution caused by guessing rather than low-precision representations. These results support the theory that WM has a limited number of slots for information storage, suggesting that memory precision and capacity may be distinct aspects of WM.

According to a review by Bays et al., the slot versus resource model debate has evolved significantly. The simple primary version of the slot model has been abandoned and replaced by versions that account for the noisy nature of visual data encoding and the impact of memory load on memory functionality30. They conclude that the inclusion of continuous variations in slot-based models and specific assumptions of complete memory failure is excessive, sparking new discussions, especially when participants are incentivized to remember items. Improved precision when rewards are included explains variation in performance under the resource-rational theory30,31. The drop in performance is perhaps not cost-effective because of neural spiking, which is a new take on how the limited resources of memory are distributed.

To avoid any potential confound, our inclusion criteria were limited to individuals younger than 40 years old32,33. However, this study had limitations which could be addressed by conducting future EEG and functional MRI studies, along with behavioral paradigms, to distinguish between different models and pathways involved. In addition, data simulation based on the slot and resource models in comparison to experimental findings could also give a clearer outlook on classical approaches. Additionally, future work could focus on implementing model-based predictions to provide a clearer understanding of how the slot model and resource model specifically apply to the tasks used in similar studies. By doing so, we can better elucidate the theoretical implications and validate the predictions derived from each model. To enhance the understanding of task-specific model associations, future studies should incorporate tasks with a diverse array of settings to assess their effects. For instance, previous research has shown that the correlation between array change-detection tasks and multi-change detection tasks with different span measures was only evident when the multi-change detection tasks involved a greater number of items than the capacity limit. This correlation was specifically observed in experiments with a 7-item array, rather than a 5-item array. By implementing changes to task arrays as well as reward implementations to assess the effects of incentivization on memory performance in future research, we can expect to obtain more robust findings30,34.

In conclusion, our study revealed a significant correlation between the DMS and analog recall paradigms, indicating that DMS tasks are not necessarily less capable of estimating WM capacity. Our results can serve as a confirmatory approach when dealing with larger sample sizes and limited time, allowing reliance on classical approaches that are easier to access and perform. For example, when assessing disease progression and cognitive function in patients with conditions that impair cognition and affect WM, such as Alzheimer’s disease, dementia or multiple sclerosis, a simpler DMS task that is easier to use in practice can be administered instead of a more time-consuming one (analog recall) to achieve similar results.