Introduction

Imagine reading an intriguing scientific paper, a fresh cup of coffee steaming on the desk, when suddenly a text notification appears in the upper-right corner of your computer screen. Despite your deep interest in the paper, your attention is automatically pulled toward the alert without your eyes leaving the middle of the screen. In everyday life, the visual system is bombarded by stimuli that induce such reflexive, spatially specific shifts of attention without accompanying eye movements, a phenomenon dubbed covert exogenous spatial attention.

Laboratory studies of covert exogenous spatial attention have taught us a great deal. We know how to elicit and measure the effects of exogenous attention, and we have an extensive understanding of the impact that exogenous attention has on perception and on the neurophysiology of the visual system1. In a typical laboratory experiment, a task-irrelevant shape (or a “cue”) is flashed briefly in the visual periphery, followed by a target stimulus at one of several possible isoeccentric locations. Observers then typically make a two-alternative forced-choice judgment about the peripheral target (e.g., orientation discrimination), while keeping the eyes fixed on the screen center. The peripheral cues typically provide no information about the target’s location. Nonetheless, task accuracy increases and response time (RT) decreases when the cue appears near the subsequent target location, compared to when the cue appears at another location. These attentional effects are rapid and transient, peaking within ~ 100 ms of the cue’s onset2. Furthermore, the allocation of covert exogenous spatial attention has been shown to increase the rate of information accrual and modulate perceptual sensitivity across a variety of tasks (e.g., 3,4,5), with effects persisting even when voluntary attention is covertly deployed to another spatial location6.

But let’s build upon our real-world example above. Imagine you have an impending grant deadline and are working in the middle of the night when text notifications are rare.

How might such associations between time-of-day and notification probability impact the efficacy with which digital onsets reflexively drive the allocation of spatial attention? This is the question we explore in the current manuscript. Specifically, we asked: are the behavioral effects of an irrelevant exogenous cue modulated by implicit knowledge about the probability of a cue’s appearance? To answer this question, we measure the automatic impacts on accuracy and RT caused by task-irrelevant exogenous cues that are preceded by task-irrelevant contextual information about the cue’s probability of occurrence. Across both low-probability and high-probability contexts, the cues are physically identical, as are the target stimuli.

Ample empirical work has established that expectation modulates the visual processing of otherwise identical stimuli (for reviews, see 7,8,9,10). Relevant for our research question, different theories within this literature make opposite predictions about whether high-probability or low-probability exogenous cues would produce stronger cueing effects11. On the one hand, Bayesian models suggest that perception is dominated by expected events (e.g., 12,13,14). A strong prior for a brief, task-irrelevant peripheral cue would strengthen its perceptual effects, with this framework suggesting that high-probability cues would produce stronger attention effects than low-probability cues. On the other hand, cancellation theories posit that unexpected stimuli elicit strong “prediction errors,” with expected events being relatively suppressed (e.g., 15,16). Thus, this alternate account suggests that high-probability exogenous cues should elicit weaker attention effects than low-probability exogenous cues.

Another related literature on “distractor suppression” has shown that observers can use prior knowledge about task-irrelevant stimuli to mitigate deleterious effects on task performance17,18. For instance, salient but irrelevant abrupt-onset stimuli have less distracting effects when the observer knows in advance their likely features or location19,20,21. The key idea is that a distracting stimulus is less able to capture attention when observers know something about it in advance. However, in that literature, observers learn about the distractor because its properties repeat consistently across trials, with experimental manipulations occurring between observers or between experimental blocks. In contrast, here we deliver information about the probability of a peripheral exogenous cue via an implicit context signal that varies randomly from trial to trial. In all cases, high-probability and low-probability cues are physically identical. Thus, any information that could be used to engage other forms of attention (e.g., feature-based attention, covert endogenous spatial attention) is present in both probability contexts. To the best of our knowledge, it is currently unknown whether such trial-to-trial, cue-probability information would modulate the allocation of covert exogenous spatial attention.

To experimentally manipulate information about the prior probability of an exogenous cue’s appearance, we took inspiration from methodological innovations originating in the study of value-driven attentional capture, or VDAC22. The VDAC literature has shown that robust associations between the color of a visual search target and the probability of high or low monetary reward can be formed in as few as 240 trials. We reasoned that such color-based associative learning might be an effective way to deliver information about the prior probability of the exogenous cue’s appearance on each trial, so we developed a protocol that uses such an approach (Figs. 1, 3). At the start of each trial, two small squares at fixation turn red or green (each color equally likely). These squares are the “context signal.” If they turn red, the exogenous cue (a peripheral disk) appears with probability 0.8. If the fixation squares turn green, the probability of cue appearance is 0.2. For half the participants, the color contingencies are reversed. Importantly, observers are not explicitly informed about the association between the context signal’s color and the probability of the exogenous cue appearing. In fact, post-experiment questionnaires indicated that observers were not aware of the information provided by the context signal, and thus, any expectations were implicit (Table 2).
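As an illustration, the color-probability contingencies just described can be sketched as follows. This is a hypothetical reconstruction of the trial-generation logic, not the authors' experimental code; the function and variable names are our own.

```python
import random

# Illustrative sketch of the context-signal contingencies described above
# (a hypothetical reconstruction, not the authors' code).
# A red context signal -> cue appears with p = 0.8; green -> p = 0.2.
# For half the participants, the color-probability mapping is reversed.
CUE_PROB = {"red": 0.8, "green": 0.2}

def make_trial(rng, reversed_mapping=False):
    color = rng.choice(["red", "green"])   # each color equally likely
    p_cue = CUE_PROB[color]
    if reversed_mapping:                   # counterbalancing across observers
        p_cue = 1.0 - p_cue
    cue_present = rng.random() < p_cue
    # When present, the cue is equally likely to appear left or right of fixation.
    cue_side = rng.choice(["left", "right"]) if cue_present else None
    return {"context": color, "cue_present": cue_present, "cue_side": cue_side}

rng = random.Random(0)
trials = [make_trial(rng) for _ in range(100_000)]
red = [t for t in trials if t["context"] == "red"]
print(sum(t["cue_present"] for t in red) / len(red))  # close to 0.8
```

Note that because the two context colors are equally likely, the overall cue-present rate works out to roughly 0.5, matching the trial counts reported in the Results.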

Fig. 1

Trial sequence, Experiment 1. s.f., spatial frequency; d.v.a., degrees of visual angle.

If the exogenous cue is presented, it is equally likely to appear on the left or right side of fixation. One hundred milliseconds after cue onset (on trials that present a cue), two oriented stimuli (Gabor gratings or lines) are presented, one on the left and one on the right side. These two stimuli are randomly and independently rotated from vertical. The observer’s task is to report the orientation of the target (clockwise, CW, or counterclockwise, CCW), which is indicated by a post-cue (a simple line element) on one side of fixation.

This protocol manipulates information about onset probability for physically identical exogenous cues. Furthermore, the cues themselves are task-irrelevant and provide no spatial information about the task-relevant target. Thus, any difference in the magnitudes of the cueing effects must be attributed to the information signaled by the color of the context signal.

To preview our results, we found that delivering information about the prior probability of an exogenous cue’s appearance modulated its ability to engender shifts in covert exogenous spatial attention, but only when the time for visual information accrual was limited. In an effort to characterize the temporal dynamics of any observed effects, Experiment 1 was conducted in-person and used a well-validated speed-accuracy trade-off (SAT) procedure5,6,23, which required participants to respond at seven distinct response delays. Findings from this initial experiment indicated that low-probability cues produced significantly larger cueing effects than did high-probability cues, but only in the first response delay condition, where responses were required within 533 ms. Additional online experiments took a between-subjects approach with respect to the temporal dynamics, isolating responses to a particular moment in the accumulation of visual information. These experiments corroborated the results of Experiment 1: low-probability cues produced significantly greater cueing effects than did high-probability cues, but only when responses were made within 500 ms of stimulus offset.

Results

Experiment 1: In-person SAT study with 7 unique response delays

Experiment 1 was conducted in-person and employed a speed-accuracy trade-off (SAT) procedure. We required participants to respond during a fixed, 500 ms window that opened at one of seven response delays (33–2133 ms after the offset of the target and distractor gratings). With a response delay of 33 ms, participants had only 533 ms from the offset of the target to process the visual information and make a perceptual decision. Requiring responses in such a short amount of time leads to many errors. With a response delay of 2133 ms, participants are required to wait before making a response, allowing time for the full accumulation of visual information. Such ample time engenders asymptotic performance. The response delays in between cover the intervening stages of visual processing.
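For concreteness, the response-window arithmetic can be sketched as below. Only the first and last delays (33 and 2133 ms) are stated in the text; `response_window` is our own illustrative helper, not the authors' code.

```python
# Sketch of the SAT response-window arithmetic described above (illustrative
# helper, not the authors' code). The window opens at the response delay and
# stays open for a fixed 500 ms.
RESPONSE_WINDOW_MS = 500

def response_window(delay_ms):
    """Return (open, close) of the response window, in ms after target offset."""
    return (delay_ms, delay_ms + RESPONSE_WINDOW_MS)

print(response_window(33))    # (33, 533): respond within 533 ms of target offset
print(response_window(2133))  # (2133, 2633): forced wait yields asymptotic accuracy
```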

Seven participants completed 18–21 experimental sessions each, resulting in a total of 112,168 trials. Of these, the 500-ms response window was missed 3,985 times (3.55%), resulting in a total of 108,183 trials with responses. Exogenous cues were presented on approximately half the trials, in line with the prior probabilities outlined in Fig. 1 (0.2 on low-probability trials, 0.8 on high-probability trials), resulting in a total of 54,415 “cue-present” orientation discrimination judgments.

A summary of the results is presented in Fig. 2. Panels A and B confirm the successful measurement of the temporal dynamics of a speed-accuracy trade-off (SAT): task accuracy rose exponentially as a function of response time and then plateaued24. Furthermore, accuracy on valid trials (the black dots) is consistently higher than on invalid trials (the white dots), confirming the successful manipulation of spatial attention. This exogenous attentional effect is present for both low-probability and high-probability onsets at all response delay conditions. Ideally, we would fit these data with the kind of SAT function we have used in previous work6 and estimate the rate of information accrual and asymptotic discriminability in each cue-validity and cue-onset-probability condition. But because all four functions start at ~ 70% correct, we cannot estimate when these functions depart from chance, which would be required to fit SAT functions in the standard way. That said, we can still address our primary research questions: (1) Does the magnitude of the exogenous cueing effect differ between low- and high-probability cues? (2) Does the nature of that attention-by-expectation interaction change as perceptual decision-making unfolds? We answer those questions by computing the cue-validity X cue-onset-probability interaction term at each response delay condition (Fig. 2E). We used strict Bonferroni correction of p-values for these 7 statistical tests.
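The Bonferroni step amounts to dividing the familywise alpha by the number of tests; assuming an alpha of 0.05 (an assumption, as the text does not state it explicitly), the critical value works out as follows.

```python
# Strict Bonferroni correction for the seven per-delay interaction tests.
# Assumes familywise alpha = 0.05 (an assumption; not stated in the text).
def bonferroni_critical_p(alpha, n_tests):
    return alpha / n_tests

print(round(bonferroni_critical_p(0.05, 7), 4))  # 0.0071, as in the Fig. 2 caption
```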

Fig. 2

Results, Experiment 1. (A) Low-probability (LP) cues. Accuracy (proportion correct, mean across participants) is plotted as a function of response time (across-subject mean of the mean correct response times in each response delay condition). Black circles, valid cues. White circles, invalid cues. (B) High-probability (HP) cues. Same format as Panel A. (C) Spatial attention effect, LP cues. The y-axis is the mean within-participant change in accuracy (proportion correct), valid minus invalid, with a bootstrapped 95% confidence interval shown for each response delay condition. The x-axis is on a categorical scale to aid in visual assessment. Each range indicates the bounds of possible response times for that delay condition (delay amount + 500 ms response window). (D) Spatial attention effect, HP cues. Same format as Panel C. (E) Cue validity x cue probability interaction. The y-axis is the mean within-participant change in the spatial attention effect, LP minus HP. Same format as Panel C. *p = 0.0015, with a critical p-value of 0.0071 after Bonferroni correction.

In short, we found that low-probability cues generated significantly larger cueing effects (Δ proportion correct, valid vs. invalid trials) than did high-probability cues, but only at the first response delay condition: mean ΔΔ proportion correct = 0.0392, t(6) = 5.53, two-tailed p = 0.0015 (corrected p = 0.0103), bootstrapped 95% CI = [0.0269, 0.0524]. At all other response delays, 95% CIs include zero and all ps > 0.1733.

To follow up on this cue-validity X cue-onset-probability interaction, we assessed the cue-validity effect at the first delay condition, separately for each cue-onset-probability. We found a significant cueing effect for low-probability cues (mean Δ proportion correct = 0.0510, t(6) = 3.75, two-tailed p = 0.0095), but not high-probability cues (mean Δ proportion correct = 0.0118, t(6) = 1.04, two-tailed p = 0.3372). To better understand which cue-validity condition might be driving the interaction, we then compared the impact of cue-onset-probability separately for each cue-validity condition. We found a marginally significant effect of onset probability for valid cues (mean Δ proportion correct = 0.0276, t(6) = 2.15, two-tailed p = 0.0753), but no evidence for an impact of onset probability for invalid cues (mean Δ proportion correct = −0.0117, t(6) = −1.08, two-tailed p = 0.3214).

Thus, this experiment provides preliminary evidence that peripheral exogenous cues have larger effects when they are unexpected than when they are expected, but that this effect is evident only in the early accumulation of perceptual evidence. Analyzing each response delay separately is justified here because we then followed up with five additional online experiments with larger sample sizes, targeting the response delays that Experiment 1 suggested were most informative.

Experiment 2: online study with a 500 ms response deadline

Experiment 2 used a response deadline of 500 ms, which was designed to mimic the first delay condition in Experiment 1. To this end, a total of 320 participants were recruited via Prolific and completed a “screening” session that contained 16 practice and 960 experimental trials of the following task: participants reported the orientation (CW or CCW of vertical) of a simple white line that was preceded, on some trials, by a task-irrelevant cue (Fig. 3). As in Experiment 1, a color presented at fixation (red or green) signaled the probability (0.8 or 0.2, counterbalanced across participants) that a cue would appear near one of two potential target locations. When presented, this cue flashed briefly, either near the target line (valid trial) or near a distractor line (invalid trial) with equal probability.

Fig. 3

Trial sequence, Experiment 2. See text for additional details.

On a series of instructional slides, we communicated the following regarding response times: “Once the fixation color changes to white, you will have 0.5 s to enter your response. In this experiment, when you respond is very important. You should respond as accurately as possible, but prioritize getting in your response during the 0.5 s response window.” Regarding the exogenous cue, participants were told: “On some trials, you may notice small dots that flash somewhere on the screen. These flashes are irrelevant and will not provide any information about where the target line will appear. Do your best to ignore them.” Using pre-determined criteria, which were communicated in the consent document and the session instructions, participants whose task accuracy exceeded 60% and who missed fewer than 10% of response windows were invited to complete two additional sessions. These sessions were identical to the screening session but contained no practice trials.

Ninety-six participants met our inclusion criteria in the screening session. One participant completed one additional session and 91 participants completed two additional sessions, resulting in a total of 267,840 trials (we will have more to say in the Discussion section about the 70% of participants who did not pass the screening session). Of these, the 500 ms response window was missed 9,533 times (3.56%), resulting in a total of 258,307 trials with responses. Peripheral cues were presented in line with the prior probabilities outlined above (0.2 for low-probability cues, 0.8 for high-probability cues), resulting in a total of 129,473 “cue-present” orientation discrimination judgments.

Figures 4 and 5 present orientation discrimination accuracy and response time data. The results of a 2 × 2 ANOVA with cue-validity and cue-onset-probability as repeated-measures factors are presented in Table 1. Consistent with Experiment 1’s first delay condition, we observed evidence for a cue-validity X cue-onset-probability interaction: low-probability cues engendered a significantly larger cueing effect, relative to high-probability cues. Regarding the RT data, the main effect of cue-validity and the lack of an interaction alleviate concerns about trade-offs in the RT domain that could complicate the accuracy-based results.

Fig. 4

Individual results, Experiment 2. (A–F) Each dot represents an individual participant. (A–C) Proportion correct. (D–F) Response time.

Fig. 5

Group Results, Experiment 2. (Top row) Accuracy and response time, mean across participants, for each condition. (Middle row) Cue-validity effect (valid vs. invalid), mean across participants with a bootstrapped 95% error bar. (Bottom row) Cue-validity X cue-onset-probability interaction effect, mean across participants with a bootstrapped 95% error bar.

Table 1 Statistics, Experiment 2. 2 × 2 ANOVA with cue-validity and cue-onset-probability as repeated-measures factors.

To follow up on the cue-validity X cue-onset-probability interaction in Experiment 2, we assessed the cue-validity effect separately for each cue-onset-probability condition. Consistent with Experiment 1, we found a significant cueing effect for low-probability cues (mean Δ proportion correct = 0.0582, t(95) = 8.18, two-tailed p < 0.0001). Inconsistent with Experiment 1, however, we found a smaller, but still statistically significant, cueing effect for high-probability cues (mean Δ proportion correct = 0.0452, t(95) = 7.77, two-tailed p < 0.0001). To better understand which cue-validity condition might be driving the interaction, we then compared the impact of cue-onset-probability separately for each cue-validity condition. We found a significant effect of onset probability for valid cues (mean Δ proportion correct = 0.0102, t(95) = 2.77, two-tailed p = 0.0068) and no evidence for an impact of onset probability for invalid cues (mean Δ proportion correct = −0.0029, t(95) = −0.63, two-tailed p = 0.5299), both of which are consistent with Experiment 1.

If this cue-validity X cue-onset-probability interaction is truly contingent on responses being made under strict time pressure (i.e., early in the accumulation of visual information), then we might expect the cueing effect difference to be larger when responses were faster. Given the size of the Experiment 2 dataset, we were able to split the data into two bins, based on the midpoint of the 500 ms response deadline: RTs ≤ 250 ms (faster bin) and RTs > 250 ms (slower bin). We ran a 2 × 2 × 2 ANOVA with cue-validity, cue-onset-probability, and RT bin as repeated-measures factors, and found a significant three-way interaction (F(1,95) = 4.25, p = 0.042). As expected, the cueing effect difference found in the faster bin was significantly larger than the cueing effect difference in the slower bin. See Supplemental Results, Experiment 2 Binning Analysis for additional follow-up analyses.
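The binning and interaction arithmetic can be sketched as follows, using hypothetical trial records (the field names and toy data are our own; the actual analysis used an ANOVA over the full dataset).

```python
from statistics import mean

# Hypothetical trial records illustrating the RT bin split at the midpoint of
# the 500 ms deadline and the cueing-effect-difference (interaction) arithmetic.
trials = [
    {"rt": 180, "valid": True,  "low_prob": True,  "correct": 1},
    {"rt": 210, "valid": False, "low_prob": True,  "correct": 0},
    {"rt": 320, "valid": True,  "low_prob": False, "correct": 1},
    {"rt": 340, "valid": False, "low_prob": False, "correct": 1},
]

def accuracy(rows, valid, low_prob):
    sel = [r["correct"] for r in rows if r["valid"] == valid and r["low_prob"] == low_prob]
    return mean(sel)

def cueing_effect_difference(rows):
    # (valid - invalid) for low-probability cues, minus the same for high-probability cues
    lp = accuracy(rows, True, True) - accuracy(rows, False, True)
    hp = accuracy(rows, True, False) - accuracy(rows, False, False)
    return lp - hp

faster = [t for t in trials if t["rt"] <= 250]  # faster bin
slower = [t for t in trials if t["rt"] > 250]   # slower bin
print(cueing_effect_difference(trials))  # LP effect (1) minus HP effect (0)
```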

Signal-detection-theoretic Bayesian Modeling

We acknowledge that using proportion correct to summarize the data in this particular experiment may be problematic. One reason might be the bounded nature of the proportion correct scale. Using a different sensitivity measure, like dʹ from signal detection theory, is a common strategy to overcome this concern. But even if we were to compute dʹ using hit and false alarm rates calculated from the trial-level data, we would still be left with two additional concerns: (1) Some participants had fewer sessions than others but are contributing to the group mean with no reflection of this difference in data quantity. (2) The number of trials containing high-probability cues far exceeded the number of trials containing low-probability cues, and a direct comparison of proportion correct values fails to capture this difference in the amount of available data.
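For reference, the standard signal-detection computation of dʹ from hit and false-alarm rates looks like this (a textbook sketch, not the authors' analysis code):

```python
from statistics import NormalDist

# d' = z(hit rate) - z(false-alarm rate), using the inverse standard-normal CDF.
def d_prime(hit_rate, fa_rate):
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# A hit rate of 0.75 with a false-alarm rate of 0.25 gives d' ≈ 1.35,
# the value later used as the prior mean for baseline sensitivity.
print(round(d_prime(0.75, 0.25), 2))  # 1.35
```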

To simultaneously address all of these concerns, we analyzed the data from Experiment 2 using a Bayesian, signal-detection theoretic model (full model specifications presented below, Model 1). On each trial, a participant makes a 2-alternative, forced-choice judgment of the target grating’s orientation (clockwise or counterclockwise of vertical, CW and CCW, respectively). Successive binary outcomes of this kind are well-described by a binomial distribution. In our experiment, we can model the number of CCW responses (numCCW, Model 1, line 1) as N draws from a binomial distribution with probability p, where N is the number of trials (numTrials, Model 1, line 1). It should make intuitive sense that the probability of a CCW judgment depends on a number of distinct factors, the most obvious being the orientation of the target. This is why there are separate lines for p when the target was CCW (Model 1, line 2) and when the target was CW (Model 1, line 3). Lines 2 and 3 present the standard mathematical notation for a signal-detection theoretic model25, with d representing perceptual sensitivity to CCW orientations and b representing decision criterion.

We want to understand how d changes as a function of the validity of the cue, the prior probability that the cue will appear, and the interaction of these two factors. To do so, we define d using a linear model that contains two “dummy” predictors: isValid (1 = valid cue, 0 = invalid cue) and isLowProbability (1 = low-probability cue, 0 = high-probability cue), written as isV and isLP for compactness (Model 1, line 4). For each individual, i, four parameters are estimated: δ0i, δ1i, δ2i, and δ3i, allowing for the computation of d in each experimental condition (V_LP = δ0 + δ1 + δ2 + δ3; I_LP = δ0 + δ2; V_HP = δ0 + δ1; I_HP = δ0). Thus, this model has both random intercepts and random slopes. We also estimate a single b parameter (Model 1, line 5) for each individual participant, i.

We assign all of the parameters uninformative priors (Model 1, lines 6–10), which allows for a wide range of baseline sensitivities and attentional modulation in either direction. The mean of the δ0 prior distribution is set to 1.35, as this value of dʹ equates to a hit rate of 0.75 and a false alarm rate of 0.25 (i.e., halfway between chance and ceiling, with a neutral criterion).

Model 1:

$$\begin{array}{ll}
1: & \mathit{numCCW} \sim \mathrm{Binomial}(\mathit{numTrials},\, p)\\
2: & p_{wasCCW} = \Phi(d/2 - b)\\
3: & p_{wasCW} = \Phi(-d/2 - b)\\
4: & d = \delta_{0i} + \delta_{1i}\times \mathit{isV} + \delta_{2i}\times \mathit{isLP} + \delta_{3i}\times \mathit{isV}\times \mathit{isLP}\\
5: & b = \beta_{i}\\
\text{Prior Distributions:} & \\
6: & \delta_{0i} \sim \mathcal{N}(1.35,\, 1)\\
7: & \delta_{1i} \sim \mathcal{N}(0,\, 1)\\
8: & \delta_{2i} \sim \mathcal{N}(0,\, 1)\\
9: & \delta_{3i} \sim \mathcal{N}(0,\, 1)\\
10: & \beta_{i} \sim \mathcal{N}(0,\, 1)
\end{array}$$
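A minimal sketch of Model 1's likelihood in Python, with hypothetical parameter values (the actual fitting was done via Hamiltonian Monte Carlo in R, as described below):

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF, as in Model 1, lines 2-3

def p_ccw_response(was_ccw, d, b):
    # Model 1, lines 2-3: probability of a CCW response given the stimulus
    # orientation, sensitivity d, and criterion b.
    return Phi(d / 2 - b) if was_ccw else Phi(-d / 2 - b)

def d_for_condition(d0, d1, d2, d3, is_valid, is_low_prob):
    # Model 1, line 4: dummy-coded linear model for sensitivity d.
    return d0 + d1 * is_valid + d2 * is_low_prob + d3 * is_valid * is_low_prob

# At the prior mean (d = 1.35) with a neutral criterion (b = 0), the model
# implies a hit rate of 0.75 and a false-alarm rate of 0.25, as noted above.
print(round(p_ccw_response(True, 1.35, 0.0), 2))   # 0.75
print(round(p_ccw_response(False, 1.35, 0.0), 2))  # 0.25
```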

The prior distributions reflect our beliefs about the impact of cue-validity, cue-onset-probability, and their interaction before seeing the experimental evidence. The posterior distributions reflect our updated beliefs about each parameter, after incorporating the evidence from the experiment. To estimate the posterior distributions of each parameter in the model, we used the “rethinking” package in R26, which implements Hamiltonian Monte Carlo. We then used the posterior distributions to compute d for each experimental condition, for each individual. Lastly, we computed the cue-validity effect (valid minus invalid) for each cue-onset-probability condition and the interaction term (low-probability cueing effect minus high-probability cueing effect), for each individual. To estimate these effects at the group level, we took the mean across individuals and calculated the 95% credible intervals of the resulting distributions (Fig. 6).

Fig. 6

Posterior distributions, Model 1. Posterior density distributions for each cue-validity effect (low-probability cues in green, high-probability cues in red) and the cue-validity x cue-onset-probability interaction (in blue). The peak value and the 95% highest posterior density interval (HPDI) of each distribution are provided in each colored box. The dotted lines indicate the 95% HPDI for the interaction posterior distribution. The dashed line indicates the mean interaction term computed directly from the data. The solid line marks zero as an x-axis reference point.

The results of the Bayesian model are consistent with those from the 2 × 2 repeated-measures ANOVA on proportion correct data. The 95% HPDI of the interaction distribution does not contain zero, and 98.46% of its values exceed zero. Note that the posterior distribution for low-probability cues is visibly wider than the posterior distribution for high-probability cues. This difference in uncertainty, driven by the difference in the number of trials in each condition, is exactly what we hoped to incorporate by using a trial-level analysis.

Additional online studies

We ran four additional online studies to confirm that the cue-validity X cue-onset-probability interaction was not present at longer response times. Experiment 3 allowed for an additional 200 ms of information accrual and decision time relative to Experiment 2, requiring responses within 700 ms. Experiment 4 allowed participants to respond freely, balancing speed and accuracy as they wished. The final two experiments used a response delay, requiring participants to wait at least 1067 ms before responding. In Experiment 5, this delay was unenforced: if a participant pressed a response key during the delay period, there were no consequences, but the key would have to be pressed again during the response window. In Experiment 6, the response delay was enforced: if a response key was pressed during the delay period, the trial ended immediately and participants were shown a 1500 ms reminder that they must wait to make a response. Proportion correct data were analyzed via a 2 × 2 ANOVA with cue-validity and cue-onset-probability as repeated-measures factors, separately for each experiment.

Each of these experiments showed a main effect of cue validity in the accuracy domain (all ps < 0.0001), but no main effect of cue-onset-probability (all ps > 0.174) and no cue-validity X cue-onset-probability interaction (all ps > 0.426). See Supplemental Results, Additional Online Experiments for full reporting. In sum, these findings confirm that the onset-probability modulation of covert exogenous spatial attention occurs only in the early stages of perceptual evidence accumulation, as revealed in Experiments 1 and 2 when participants were required to respond very quickly.

Assessing awareness of experimental contingencies

Post-experiment questionnaires demonstrated that any expectations for the task-irrelevant cues were formed implicitly. At the end of the final session, we asked participants in four of the online studies three questions, which were presented separately and answered in succession. Despite high rates of noticing that the fixation color was variable (> 80% in all experiments), substantially fewer participants reported noticing that the color of the fixation circle predicted other events in the trial (< 16% in all experiments). When asked to identify the experimental manipulation, the percentage of correct responses was below chance (25%) in all four groups of participants (range: 14.63–22.22%). Lastly, the percentage of participants who reported noticing that the color of the fixation circle predicted other events and who correctly identified the color-onset contingency was remarkably small (< 3% in all experiments). Thus, these data suggest that participants were not consciously aware of the relationship between the color of the fixation circle and the probability of cue onset (Table 2).

Table 2 Evidence for implicit expectations.

Discussion

This project makes several contributions to the literature. First, it introduces a novel method for manipulating and measuring the impact of implicit knowledge about task-irrelevant events on covert exogenous spatial attention. This method was successfully deployed in the lab and online with a global participant base. Second, this project provides evidence that low-probability exogenous cues elicit a stronger reflexive shift of spatial attention than high-probability exogenous cues. This strengthening of the cueing effect is transient, occurring only when the behavioral response is made under strict time pressure. The temporal component of this interaction between exogenous attention and cue-probability information is especially important; it highlights that future studies should pay careful attention to the experimental control of response times, so as to avoid protracted debates over conflicting findings.

To the best of our knowledge, no other study has manipulated the prior probability of a task-irrelevant exogenous cue’s appearance, on a trial-to-trial basis via implicit associative learning, and measured its effect on covert exogenous spatial attention. That said, this project may bring to mind a large literature on “distractor suppression.” Within this literature, prolonged experience or history with an abrupt-onset’s features or location has been found to reduce the distracting effect such an onset has (abrupt-onset color singletons: 19; simple luminance flashes: 20,21). However, we note three important distinctions between our approach and traditional distractor suppression paradigms. First, the task in many papers about distractor suppression is visual search, which requires and encourages overt attentional shifts (eye movements). We set out to study covert exogenous spatial attention, a distinct process without accompanying eye movements1. Second, some papers in the distractor suppression literature do focus on covert processes, but with task designs that allow for the engagement of other forms of attention. For example, a precue that provides exact spatial information about the upcoming target’s location facilitates voluntary attentional allocation (e.g., 27). Likewise, a target that differs in color from an abrupt-onset distractor promotes the use of feature-based attention to boost the former and suppress the latter (e.g., 28). Neither voluntary attentional strategy to suppress the effects of the “distractor” (in this case the exogenous cue) would be possible in our task. Third, these studies employed either a between-participant or blocked design, so that participants could learn about the distractor from exact stimulus repetitions over many consecutive trials. In contrast, our protocol interweaves high-probability and low-probability cues in a mixed trial design, and the information provided by the context signal was implicit and task-irrelevant.
Thus, our results differ in significant ways from previous phenomena labeled “distractor suppression.” But they are consistent with the general notion that information about irrelevant stimuli can modulate their attentional effects. Again, it is important to note that we found this modulation to be conditional on responses being made during the early accumulation of visual information.

As discussed by Press and colleagues11, opposing theories describe how expectations affect perception, each with its own body of empirical evidence. Bayesian theories argue that predicted stimuli have a greater influence on perception12,13,14, while cancellation theories claim that violations of prediction take priority in forming the percept15,16. We found that low-probability exogenous cues generated larger cueing effects than did high-probability exogenous cues, which is most consistent with the framework of cancellation theories. It is worth emphasizing an important point that pertains to the expectation literature more holistically: we successfully measured a modulation in the accuracy of an orientation discrimination judgment, induced by a task-irrelevant color change that indicated the probability of another irrelevant stimulus flashing briefly in the periphery. In short, these data lend credence to the idea that the brain is constantly learning contingencies—even about stimuli that are not directly relevant to the task—and that these contingencies reflexively modulate behavior. That said, the presence of a peripheral cue did provide some task-relevant information. It alerted observers to the fact that the presentation of the target was imminent, and resulted in faster response times relative to cue-absent trials in Experiment 2 (see Supplemental Results, Experiment 2 Cue-Absent Comparisons for details). Such “readiness information” would be available anytime the cue appeared, irrespective of cue validity or cue-onset-probability. So, the temporal information provided by the cue cannot account for our primary finding that cue-onset-probability modulates the magnitude of the cue-validity effect. But when responses are required to be made under strict time pressure, we concede that peripheral onsets may not be entirely task-irrelevant.

Regarding the primary empirical finding that information about the probability of an exogenous cue’s appearance modulates its behavioral cueing effect, we note a number of strengths. First, we find evidence for the transient nature of the interaction between probability condition and cueing effect in two independent samples. This interaction was present in Experiment 1’s first response delay period, and in Experiment 2, which employed a 500 ms response deadline. At longer response times, as measured independently in Experiment 1 and in Experiments 3, 4, 5 and 6, there was no evidence of an interaction between the exogenous cue’s probability of appearance and the cue validity effect. While the first sample comprised a small group of undergraduate students from an American college, the others were large online samples from many countries around the world. Another strength concerns the diverse environments in which these experiments were conducted. In Experiment 1, participants completed the task in a quiet, darkened experimental testing room with no extraneous visual stimuli. In the others, participants were required to complete the study on a desktop computer, but we had no way to control any other aspect of the testing environment. Any number of visual distractions were likely present during the online studies, but this transient interaction between cue probability and exogenous attention was observed again (Experiment 2).

It is also worth noting that we replicated the classic finding in the covert exogenous spatial attention literature (i.e., peripheral cues improve task performance when they appear near a subsequently presented visual target) in all of our experiments. Lastly, we showed that our results were robust to very different statistical approaches: (1) Two stalwarts of the frequentist toolkit (paired t-tests and repeated-measures ANOVAs) showed that we can be confident in rejecting the hypothesis that the spatial attention effect engendered by a high-probability cue is equivalent in magnitude to that engendered by a low-probability cue. (2) A Bayesian signal-detection theoretic model showed that we can be confident in believing that the change in detection sensitivity engendered by a low-probability cue exceeds that of a high-probability cue.

There are also a number of important limitations to our study. In both the lab-based SAT (Experiment 1) and the online experiment with a 500 ms response deadline (Experiment 2), only ~ 30% of the participants were able to meet the criteria needed to qualify for additional sessions: specifically, to respond quickly enough. For future studies, one potential remedy is to add practice sessions, so that participants might learn to respond within a constrained temporal window. The drawback of additional practice sessions is the added cost in time and money.

Evidence for a transient interaction between cue probability condition and cue validity passed muster with respect to the statistical norms of our field and replicated in an independent sample, but we concede that it is not a large effect. All 7 participants in the lab-based SAT study showed larger cueing effects for low-probability cues at the first response delay (hence the remarkably small 95% bootstrapped error bars in Fig. 2E). That said, the scatterplot for Experiment 2 reveals a considerable amount of individual variability (Fig. 4C). Such variability could be due to measurement noise induced by the differences in task environment discussed above, it could be due to differences in the measurement devices themselves (e.g., different computers, monitors, web browsers), or it could be a genuine reflection of individual differences. Using changes in the accuracy of a simple orientation discrimination judgment to infer that the brain is tracking contingencies between task-irrelevant events is a tall order. So regardless of the true underlying cause, it is worth noting that we did not begin these studies with an expectation for large effects.

Many important questions remain after this preliminary investigation. First, how do prior probabilities about a peripheral cue’s location influence the covert exogenous cueing effect? In our experiments, the context signal indicated the cue’s probability of appearing, but its location was always uncertain. It is possible that removing such spatial uncertainty might assist in the formation of implicit associations, thus strengthening the magnitude of the interaction we observed in the current study. Moreover, the context signal itself was subtle in these experiments. Would a more salient, engaging context signal strengthen the modulation of spatial attention? Future work will also be needed to better understand the temporal dynamics of the learning itself. How quickly are the implicit associations formed? Do the associations extinguish? These are important questions that can all be addressed using well-established psychophysical methods. But given that low-probability events happen infrequently in any one session, sufficiently answering them will require significant amounts of data, from a large number of participants. Crucially, the neural mechanism responsible for the modulatory effect of onset probability on the reflexive allocation of spatial attention is currently unknown. But the methodological innovation developed here will be critical for future neuroimaging work aimed at developing and refining a mechanistic understanding of the behavioral results reported in this study. The results of the six experiments reported here suggest that more exploration is warranted.

Methods

All experiments

Bayesian modeling

We used the ulam function in the rethinking package in R to estimate the posterior distributions for Experiment 2. We ran 4 chains, each with 10,000 iterations. The largest R-hat convergence diagnostic value for any parameter was 1.0004, which indicates successful mixing of the chains. The model itself is available in the data folder linked above, as is a spreadsheet containing detailed information about each posterior distribution in the model (mean, standard deviation, 95% credible interval, effective sample size, and R-hat). See also Supplemental Results, Prior Predictive Check and Supplemental Results, Posterior Predictive Check (Figs. S2, S3).
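For readers unfamiliar with the R-hat diagnostic reported by ulam, the following Python sketch illustrates what it computes. This is the classic (non-split) Gelman–Rubin statistic; ulam/Stan report a split-chain variant, so values can differ slightly, and the function name here is our own.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for an array of shape
    (m_chains, n_draws). Values near 1.0 indicate the chains mixed well
    and are sampling the same distribution."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    # W: average of the within-chain variances
    W = chains.var(axis=1, ddof=1).mean()
    # B/n: variance of the chain means (between-chain component)
    B_over_n = chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the marginal posterior variance
    var_hat = (n - 1) / n * W + B_over_n
    return np.sqrt(var_hat / W)
```

For well-mixed chains this returns a value very close to 1; chains stuck at different means inflate the between-chain component and push R-hat well above 1.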

Institutional review board approval

The experimental procedures for all experiments were approved by the Institutional Review Board (IRB) at Trinity College, and all methods were performed in accordance with the relevant IRB guidelines and regulations.

Experiment 1

Apparatus

The perceptual task was programmed in PsychoPy29 and run on a 3.0 GHz Dual-Core Intel Core i7 Mac Mini; stimuli were displayed on a 27.0" LED-lit Dell gaming monitor (model: S2716DG) with a screen resolution of 2560 × 1440 pixels. Participants were seated in a darkened experimental testing room and indicated their responses via a Logitech F310 gaming controller.

Participants

Seven participants completed Experiment 1 (age range, 19–21; genders, 3F/4M). The first two participants were co-authors, both of whom completed 21 experimental sessions (17,640 trials) without monetary compensation. At the time of data collection, each knew that the color of the fixation squares was related to the probability of the cue onset, but little else. Neither saw any data until all of their sessions were complete, and all of the online studies were conducted well after each had finished all sessions. Thus, neither of these participants knew anything about the outcome or the pattern of results presented here at the time of data collection.

Fifteen additional participants, all undergraduates at Trinity College, completed a “screening” session for Experiment 1. In this session, participants first completed a task to verify typical color vision. For this task, the two fixation squares were presented in the same color or in different colors. Participants made 40 same/different judgments regarding the colors of the fixation squares, with no time constraints. Each observer then completed practice trials of the main task, followed by 840 experimental trials (Fig. 1). To be eligible for additional sessions, participants were required to provide correct responses on at least 70% of the trials in the color task (28/40) and in the SAT task (588/840). Attesting to the difficulty of responding correctly (or at all) during the 500-ms response window, only five participants met these inclusion criteria. All participants received $15 for completing the screening session.

The five participants who qualified all completed 17 additional sessions, each of which contained 840 experimental trials (15,120 total trials across the 18 sessions). For one of these participants, a sporadic, undiagnosed computer issue abruptly ended the session on 4 occasions. The data from these sessions, up to the final response, are included (16,408 total trials for this participant). Participants received $15 for completing each session, plus a $50 completion bonus after the final session (total compensation, $320, or $380 for the participant with 4 abruptly ended sessions). The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided written informed consent. For methodological details, see Supplemental Methods, Experiment 1 Minutiae.

Instructional video

With the exception of the two co-authors, participants watched the instructional video available with the data and analysis scripts.

Online experiments

Instructions to participants

Instructions were delivered via detailed slides that participants viewed at their own pace. They could move forward and backward, reviewing previous slides if they wished. All slides are available with the data and analysis scripts.

Preregistered analysis plans

The analysis plans for some of the online experiments were preregistered on the Open Science Framework. However, as we better understood the nuances and peculiarities of this particular protocol, we made some changes to the analysis approach. To add confidence to our conclusions, we ran Experiment 2 to replicate the key finding from Experiment 1 in an independent sample.

Previous participation

For all online experiments, participants who completed the screening session of one version became ineligible for any future versions of the study.

Experiment 2

Participants

Data for Experiment 2 were collected in four separate rounds on Prolific (20, 30, 50, and 220 submissions). Thus, a total of 320 participants were recruited and completed Session 1. Of these, five submissions contained no data. One additional participant had a datafile, but did not respond on any of the trials.

Of the remaining 314 participants, 96 participants (30.57%) met our inclusion criteria for two additional sessions: (1) correct responses on more than 60% of trials (trials in which the deadline was missed were excluded), and (2) missed responses on fewer than 10% of trials. Of these 96 participants (ages 18–48, mean = 26.98; gender, 29F/67M), 92 completed session 2, and 91 completed session 3. The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided informed consent by pressing “c” on the keyboard to continue past the consent screen. For methodological details, see Supplemental Methods, Experiment 2 Minutiae.
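As a concrete illustration, the inclusion screen described above can be expressed as a small check. This Python sketch is ours, not the analysis code itself, and it assumes the 60% accuracy criterion is computed over non-missed trials, consistent with the exclusion noted above.

```python
def qualifies_for_additional_sessions(n_correct, n_missed, n_trials):
    """Sketch of the Experiment 2 inclusion criteria:
    (1) correct on more than 60% of non-missed trials, and
    (2) missed responses on fewer than 10% of all trials."""
    answered = n_trials - n_missed
    if answered == 0:
        return False
    return (n_correct / answered > 0.60) and (n_missed / n_trials < 0.10)
```

For example (hypothetical numbers), a participant with 500 correct responses and 50 missed deadlines across 840 trials would qualify, while one missing 100 of 840 trials would not.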

Dynamic tilt update error

Due to a coding error, the mechanism designed to dynamically update the magnitude of the line tilt (as in Experiment 1, see Supplemental Methods, Experiment 1 Minutiae) did not work as intended: from trial 57 onwards in each session, both lines were always randomly and independently rotated ± 20° from vertical.

Final session questions

We did not begin including the end-of-experiment questions until the third round of data collection for Experiment 2. This is why there are data from 80 participants and not 91.

Compensation

Participants received a flat rate of $10.00 for completing each approximately 1-h session.

Computing dʹ

To calculate dʹ from the trial-level data, hits were defined as trials in which a CCW target was present and the participant reported CCW. False alarms were defined as trials in which a CW target was present and the participant reported CCW. To calculate dʹ, we subtracted the inverse of the standard normal cumulative distribution function evaluated at the proportion of false alarms (FA) from its value at the proportion of hits (H), such that: dʹ = Φ−1(H) – Φ−1(FA).

Simulating data for the prior and posterior predictive checks (see Supplemental Results, Prior Predictive Check and Supplemental Results, Posterior Predictive Check) would sometimes result in hit rates equal to 1 or false alarm rates equal to zero, which yields a dʹ value of ∞ or −∞. As is typical, we introduced “half an error” when either occurred (e.g., with 100 signal-present trials, 100 hits became 99.5 hits; with 100 signal-absent trials, 0 false alarms became 0.5 false alarms).
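The dʹ computation and the “half an error” correction described above can be sketched as follows. This is an illustrative Python implementation (the reported analyses were run in R); the symmetric corrections for zero hits and a perfect false-alarm rate are our addition for completeness.

```python
from statistics import NormalDist

def d_prime(n_hits, n_signal, n_fa, n_noise):
    """d' = Phi^{-1}(H) - Phi^{-1}(FA), where Phi^{-1} is the inverse of
    the standard normal CDF. "Half an error" is applied at the extremes
    (e.g., 100/100 hits -> 99.5; 0/100 false alarms -> 0.5) to avoid
    infinite d' values."""
    if n_hits == n_signal:
        n_hits -= 0.5
    elif n_hits == 0:
        n_hits = 0.5
    if n_fa == 0:
        n_fa = 0.5
    elif n_fa == n_noise:
        n_fa -= 0.5
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(n_hits / n_signal) - z(n_fa / n_noise)
```

With the correction, a perfect run of 100 hits on 100 signal-present trials and 0 false alarms on 100 signal-absent trials yields a finite dʹ of about 5.15 rather than infinity.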

Experiment 3

Notable changes from Experiment 2

(1) After the offset of the line stimuli, the fixation circle turned white and participants were required to enter a response within 700 ms. (2) A dynamic tilt updating procedure was successfully implemented, with proportion correct used to update the line orientation after every 56 trials with the following formula: \(Til{t}_{new}=Til{t}_{current}-((p-0.75)*Til{t}_{current})\), where \(p\) refers to proportion correct in the previous 56 trials. The maximum allowable tilt was 20°.
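The tilt-updating rule in (2) can be written directly from the formula. A minimal Python sketch (function and variable names are ours):

```python
MAX_TILT = 20.0  # maximum allowable tilt, in degrees from vertical

def update_tilt(tilt_current, p_correct):
    """Tilt_new = Tilt_current - ((p - 0.75) * Tilt_current), applied
    after every 56 trials, where p is the proportion correct in those
    trials. Above-criterion performance (p > .75) shrinks the tilt,
    making the discrimination harder; below-criterion performance
    enlarges it, capped at MAX_TILT."""
    tilt_new = tilt_current - (p_correct - 0.75) * tilt_current
    return min(tilt_new, MAX_TILT)
```

Equivalently, the rule multiplies the current tilt by (1.75 − p), so the tilt stays positive for any proportion correct and converges toward the 75%-correct criterion.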

Participants

Data for Experiment 3 were collected in two separate rounds on Prolific (51 and 201 approved submissions). Of these, ten datasets were incomplete or missing all data. Of the remaining 242 full datasets, 132 participants (ages 19–47, mean = 25.89; gender, 44F, 87M, 1 prefer not to say) met our inclusion criteria for additional sessions, which were the same as in Experiment 2. Twenty-eight of these came from the first Prolific round, which allowed for two additional sessions; 27 participants completed one additional session, and 26 participants completed both additional sessions. One hundred four of these came from the second Prolific round, which allowed for only one additional session; 97 attempted to complete the additional session, but computer issues resulted in data not being saved for five of them, leading to a total of 92 session 2 datasets in this round. The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided informed consent by pressing “c” on the keyboard to continue past the consent screen.

Final session questions

We have data for all participants who completed the final session. This was session 3 in the first round of data collection (N = 26) and session 2 in the second round of data collection (N = 92), for a total of 118 participants.

Compensation

Participants received a flat rate of $10.00 for completing each approximately 1-h session.

Experiment 4

Notable changes from Experiment 2

(1) The duration of the “context signal” decreased from 1000 to 500 ms. (2) After the offset of the line stimuli, the fixation circle maintained its context signal color (red or green), and participants had up to 20 s to input a response. (3) Visual feedback consisted of a white fixation circle for correct responses, and an “X” for incorrect or missed deadline responses. (4) No ITI was included in the trial sequence. (5) The mechanism that was meant to dynamically update the magnitude of the line tilt worked as intended, matching that of Experiment 3.

Participants

Fifty participants were recruited via Prolific and completed Session 1. Of these, one dataset contained no data; messages with the participant indicated a computer issue. The remaining 49 datasets were analyzed. Forty-five participants provided a correct response on more than 60% of the trials and were invited for two additional sessions (ages 19–46, mean = 26.2; genders, 22F/23M). All 45 qualified participants completed all three sessions. The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided informed consent by pressing “c” on the keyboard to continue past the consent screen.

Compensation

Participants received a flat rate of $10.00 for completing each approximately 1-h session.

Final session questions

We did not include these questions for anyone in Experiment 4.

Experiment 5

Notable changes from Experiment 4

Experiment 5 used the same segments and timing as Experiment 4, with the following exceptions. (1) We introduced a 1067 ms “response delay” period. During this time, the fixation circle (still red or green) appeared at the center of a gray screen. The response cue was presented, indicating which line was the target. Participants were instructed to wait until the fixation circle turned white before making their response. (2) The response window segment was open for 1000 ms or until a response was made (whichever came first). The fixation circle changed to white to indicate the start of the response window. The response cue indicating which line was the target remained onscreen. (3) Experiment 5 used the same emoji-based feedback system from Experiments 2 and 3. (4) The final session questionnaires were reinstated at the end of session 3.

Participants

Fifty participants were recruited via Prolific and completed Session 1. Of these, one dataset contained no data; messages with the participant indicated a computer issue. The remaining 49 datasets were analyzed. Forty-seven participants provided a correct response on more than 60% of the trials and were invited for two additional sessions. Of these 47 participants (ages 19–49, mean = 25.68; gender, 25F/22M), 46 completed session 2, and 45 completed session 3. The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided informed consent by pressing “c” on the keyboard to continue past the consent screen.

Compensation

Participants received a flat rate of $10.00 for completing each approximately 1-h session.

Experiment 6

Notable changes from Experiment 5

Before launching Experiment 5, we discussed whether or not we should “enforce” the response delay by listening for key presses during the delay period and stopping the trial if one occurred. We decided to listen for key presses, but not interfere. We found that key presses occurred during the delay period on more than 30% of the trials in Experiment 5. Therefore, in Experiment 6, we enforced the response delay in the following way: if either response button was pressed during the delay period, the trial ended and participants were shown a yellow palm emoji for 1500 ms with text that read: “Please wait until the fixation circle turns white before responding”. Apart from this change, Experiment 6 used the same segments and timing as Experiment 5.

Participants

Fifty participants were recruited via Prolific and completed Session 1. Of these, one dataset contained no data; messages with the participant indicated a computer issue. The remaining 49 datasets were analyzed. To be invited for two additional sessions, accuracy had to exceed 60% correct and early responses had to occur on fewer than 10% of trials. Forty-five participants met these criteria. Of these 45 participants (ages 18–45, mean = 24.6; gender, 17F/28M), 43 completed session 2, and 42 completed session 3. The experimental procedures were approved by the Institutional Review Board at Trinity College, and all participants provided informed consent by pressing “c” on the keyboard to continue past the consent screen.

Final session questions

Forty-one participants completed the final session questions. One participant messaged to say that they hit “esc” before completing Session 3. They tried to start again but did not finish an entire new session. The two datafiles together contain more than 960 trials, so all trials are included and labeled as session 3 in the datafile. That said, there are no end-of-session questions for this participant.

Compensation

Participants received a flat rate of $10.00 for completing each approximately 1-h session.