Introduction

Firing rate adaptation is a well-established feature of sensory neurons, involving short-term reduction in spiking activity and changes in sensitivity to the features of a repeatedly presented stimulus1,2. This process is an important resource to achieve code efficiency in the brain as it ensures heightened sensitivity to unexpected or unusual stimuli3,4. However, adaptation also leads to visual illusions such as the motion after-effect (MAE): the perception of motion in a stationary image following prolonged exposure to a moving stimulus5. This phenomenon is also known as the waterfall effect, coined after its notable observation in a natural setting.

In neuroscience research, visual adaptation has been documented through psychophysical experiments6,7 and electrophysiological recordings8,9. In functional imaging, adaptation serves as a signature of region-specific encoding, with regions responsive to a specific feature displaying reduced activity10,11,12,13. Conversely, historical experiments searched for a neural correlate of the MAE by examining increased activity in non-adapted neuron populations14,15. Nevertheless, isolating the effects of adaptation is challenging, since functional magnetic resonance imaging (fMRI) shows the activity of entire voxels, likely containing diverse neuron populations coding for different features.

Adaptation is believed to play an important role in the perception of ambiguous stimuli by driving switches between the various available interpretations16,17. A notable bistable stimulus is the plaid stimulus, consisting of inward-moving gratings that can be perceived either as a plaid moving coherently downward or as two gratings moving incoherently through each other (Fig. 1). In an fMRI study from our group, Sousa et al.18 manipulated plaid stimuli to induce either coherent or incoherent perception, observing distinct levels of brain activation in the middle temporal (MT) visual area for each type of perception (Fig. 2). Specifically, they noted that the initial response to coherent motion was weaker compared to the response to incoherent motion. There are two potential explanations for the underlying neural mechanisms (Fig. 3): a weaker coherent response may result from stronger adaptation-mediated reduction during coherent versus incoherent motion, or a stronger incoherent response could stem from the involvement of more neural populations to represent motion in more directions. While Sousa et al.18 also demonstrate a time-dependent change in the relationship between responses to different conditions, this falls outside the scope of the present study.

Fig. 1
figure 1

Bistable plaid stimulus: two superimposed gratings moving in opposite directions can be perceived as either one plaid moving downward (coherent motion) or as two gratings moving past each other (incoherent motion).

Fig. 2
figure 2

Experimental blood-oxygen-level-dependent (BOLD) signal measured in our previous study (Sousa et al.18) in area MT in response to disambiguated plaid stimuli. Each curve corresponds to the average response across twenty participants to each type of condition: 30-second adaptation to coherent motion, 30-second adaptation to incoherent motion and 30-second visualization of non-adapting motion (see Methods for details). After exposure to the moving stimuli, a static plaid was displayed for 12 s. Shaded area around curves represents standard error of the mean.

Fig. 3
figure 3

Two hypotheses to explain lower response to coherent motion. The first hypothesis proposes that incoherent and coherent stimuli elicit the same response, but the coherent representation undergoes a more pronounced reduction during prolonged exposure through adaptation. The second complementary hypothesis proposes that the coherent stimulus elicits a smaller response because it recruits a smaller neural representation, due to motion in only one direction.

Here, we aim to explain the distinct response levels reported by Sousa et al. (Fig. 2) by employing a firing rate model and the Balloon-Windkessel model to simulate the activity and resulting blood-oxygen-level-dependent (BOLD) signal of a network of thirty-two neurons, each tuned to one direction of motion. By varying the levels of neuronal adaptation, we determine that strong adaptation is necessary to replicate the experimental curves, both during and after stimulus presentation. However, increasing neuronal adaptation does not affect the relative order of the levels of the coherent and incoherent response curves, suggesting that differences in neuronal recruitment contribute to the lower response to coherent motion.

Methods

To simulate the activity of area MT, we used a firing rate model with 32 neuronal units (Fig. 4). Each unit represents a neuron or a population of identical neurons tuned to motion in one direction. The dynamics of each neuron is described by two differential equations: one for the evolution of the firing rate as a function of the synaptic current and one for the accumulation of adaptation as a function of the firing rate. Firing rate activity is then transformed into a BOLD signal using the Balloon-Windkessel model19.

Fig. 4
figure 4

Visual representation of the simulated model of 32 neurons tuned to 32 directions of motion. The receptive field for the MT neuronal unit tuned to motion in the top right diagonal direction (45 degrees) is shown on the right (see Eq. 1 below).

Receptive field

Each neuron responds maximally to a preferred direction of motion according to the following receptive field6,20:

$$\:{\text{S}}_{\text{i}}={\text{c}}_{\text{j}}{\text{e}}^{\text{k}\left(\text{cos}\left({{\uptheta\:}}_{\text{j}}-{{\uptheta\:}}_{\text{i}}\right)-1\right)},$$
(1)

where \(\:{\text{S}}_{\text{i}}\) is the response of neuron \(\:\text{i}\), with a preferred direction \(\:{{\uptheta\:}}_{\text{i}}\), to a stimulus moving with intensity \(\:{\text{c}}_{\text{j}}\) in the direction \(\:{{\uptheta\:}}_{\text{j}}\), and \(\:\text{k}\) is a measure of the bandwidth of the receptive field, with higher \(\:\text{k}\) corresponding to narrower tuning. For 32 neurons with equidistant preferred directions, \(\:{{\uptheta\:}}_{\text{i}}=0^\circ\:,11.25^\circ\:,12.5^\circ\:,23.75^\circ\:,35^\circ\:,\dots\:\) (Fig. 4).

Differential equations

The synaptic current \(\:\text{I}\) entering each neuron is:

$$\:\text{I}={\text{S}}_{\text{i}}+{\text{I}}_{\text{b}}\:,$$
(2)

with Si the sensory input, given by Eq. (1), and Ib the baseline activity. The synaptic process is assumed to be quasi-instantaneous compared to the firing rate process21, which is described by:

$$\:{\uptau\:}\frac{d\text{F}}{dt}=-\text{F}+\frac{{\left[\text{I}\right]}_{+}^{2}}{{\text{s}}^{2}+{\left[\text{I}\right]}_{+}^{2}}\:,$$
(3)

where \(\:\text{F}\) is the firing rate, \(\:{\uptau\:}\) is the time constant of the low-pass filtering process defined by the equation, \(\:\text{s}\) is the saturation constant of the non-linear activation function defined by the ratio on the right-hand side and \(\:{\left[\dots\:\right]}_{+}\) denotes half-wave rectification.

Firing rate adaptation is modeled as an exponential process2,22:

$$\:{{\uptau\:}}_{\text{A}}\frac{d\text{A}}{dt}=-\text{A}+\text{F}\:,$$
(4)

where A is the adaptation level of a given neuron, which accumulates with the firing rate F and decays with time constant τA. Adaptation acts divisively on the activation function of each neuron, thereby modifying Eq. (3) to:

$$\:{\uptau\:}\frac{d\text{F}}{dt}=-\text{F}+\frac{{\left[\text{I}\right]}_{+}^{2}}{{\text{s}}^{2}+{\text{w}}_{\text{A}}\text{A}+{\left[\text{I}\right]}_{+}^{2}}\:,$$
(5)

with \(\:{\text{w}}_{A}\) the adaptation strength. The activity of each neuron is thus calculated by integrating two differential equations, Eqs. (4) and (5).

BOLD signal

Neuronal activity gives rise to the blood-oxygen-level-dependent signal detected in fMRI through a hemodynamic process that is well-described by the Balloon-Windkessel model19, comprised of four dynamical variables: vasodilatory signal \(\:\text{s}\left(\text{t}\right)\), blood inflow \(\:\text{f}\left(\text{t}\right)\), blood volume \(\:\text{v}\left(\text{t}\right)\) and deoxyhemoglobin content \(\:\text{q}\left(\text{t}\right)\). The system of differential equations is:

$$\:\left\{\begin{array}{c}\frac{d\text{s}}{dt}=-\kappa\:s-\gamma\:\left(\text{f}-1\right)+z\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\\\:\frac{d\text{f}}{dt}=s\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\\\:{{\uptau\:}}_{\text{B}}\frac{d\text{v}}{dt}=-{\text{v}}^{1/{\upalpha\:}}+f\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\\\:{{\uptau\:}}_{\text{B}}\frac{d\text{q}}{dt}=-{\text{v}}^{1/{\upalpha\:}-1}q+\left(1-{\left(1-{\uprho\:}\right)}^{1/\text{f}}\right)f/\rho\:\:\:\:\:\:\:\:\:\end{array}\right.$$

where \(\:\text{z}\left(\text{t}\right)\) is the neuronal activity, given by the sum of the neuronal firing rates calculated through Eq. (5), and \(\:{\upkappa\:}\), \(\:{\upgamma\:}\), \(\:{{\uptau\:}}_{\text{B}}\), \(\:{\upalpha\:}\) and \(\:{\uprho\:}\) are parameters describing the rate of signal decay, rate of flow-dependent elimination, hemodynamic transit time, Grubb’s exponent of blood outflow and resting oxygen extraction fraction, respectively. The BOLD signal, \(\:\text{B}\left(\text{t}\right)\), is then a volume-weighted sum of intra- and extravascular contributions to blood volume and deoxyhemoglobin content:

$$\:\text{B}={\text{V}}_{0}\left(7{\uprho\:}\left(1-\text{q}\right)+2\left(1-\text{q}/\text{v}\right)+\left(2{\uprho\:}-0.2\right)\left(1-\text{v}\right)\right)$$

with \(\:{V}_{0}\) the resting blood volume fraction.

Stimuli and experimental parameters

The bistable plaid stimulus is composed of two inward-moving gratings with two possible interpretations: a plaid moving downward (coherent percept) or two gratings moving through each other (incoherent percept) (Fig. 1). In our previous study18, Sousa et al. added a field of moving dots to the bistable plaid stimulus to induce the disambiguated perception of either coherent or incoherent motion. A non-adapting control stimulus was used, consisting of alternating coherent and incoherent motion in eight different directions, each with a duration of 1.5 s. After 6 s of a static plaid display, the manipulated moving stimulus was presented for 30 s, followed by 12 s of a static plaid display. The gratings make a 60\(\:^\circ\:\) static angle with the horizontal and move in the horizontal direction (left grating moves to the right, 0\(\:^\circ\:\), and right grating moves to the left, 180\(\:^\circ\:\)).

Besides the oriented lines that comprise each grating, the background rhombi between the lines constitute an extra component of the stimulus. As such, the stimuli have motion energy in a maximum of three directions. The coherent stimulus is dominated by downward motion, while the incoherent stimulus includes motion in all three directions (see Supplementary Videos). Assuming the same total intensity for both coherent and incoherent conditions, the coherent stimulus may be defined as having intensity 1 in the downward direction (270\(\:^\circ\:\)) and 0 in all other directions, while the incoherent stimulus can be defined as having intensity \(\:1/3\) in the 0\(\:^\circ\:\), 270\(\:^\circ\:\) and 180\(\:^\circ\:\) directions, and 0 in all other directions. These can be written as \(\:{s}_{c}=\left(\text{0,1},0\right)\) for the coherent stimulus and \(\:{s}_{i}=\left(1/3,1/3,1/3\right)\) for the incoherent stimulus. The parametrization of the stimulus constitutes one of the main aspects of the model. These parameter values are summarized in Table 1.

Table 1 Stimulus parameters.

Simulation details

The neuronal circuit was defined in PyRates23,24, where the differential equations were solved via scipy’s Runge-Kutta(4,5) integration scheme25. Parameter values are presented in Tables 24. To match the experimental trial structure18, the activity of the circuit was simulated for a total of 48 s: 6 s of initial static stimulus followed by 30 s of motion stimulus and 12 s of static stimulus. the initial static period was used to calculate the mean baseline BOLD signal and is not shown in the plots. Mean baseline signal was subtracted from the calculated BOLD response and used as a normalization factor to yield the BOLD signal variation presented in Figs. 5 and 6.

Table 2 Neuronal circuit parameters.
Table 3 Hemodynamic parameters.
Table 4 Simulation parameters.

Results

To elucidate the neuronal mechanisms probed by our fMRI experiment (Sousa et al.18), we simulate the activity of thirty-two neuron populations, each with a preferred motion direction and an independent adaptation process. The total activity of the system is obtained by summing the firing rate of all units and the corresponding BOLD signal is calculated through the Balloon-Windkessel equations. We vary the strength of neuronal adaptation, \(\:{\text{w}}_{A}\), and the intensity of the coherent and incoherent stimuli in each of the relevant directions (\(\:0^\circ\:\), \(\:270^\circ\:\), \(\:180^\circ\:\)) to test the two hypotheses: stronger adaptation-mediated reduction during coherent motion vs. weaker neuronal recruitment in response to coherent stimuli. The first hypothesis predicts that coherent response should be lower than incoherent response only when adaptation is introduced. The second hypothesis predicts that coherent response should be lower than incoherent response regardless of adaptation levels.

Adaptation is required to replicate our experimental order of response curves during stimulus presentation

Fig. 5
figure 5

Effect of neuronal adaptation strength on the simulated BOLD signal, for each of the experimental conditions (Adapt Coherent, Adapt Incoherent, Non-adapting). Stimulus parameters were \(\:{s}_{c}=\left(\text{0,1},0\right)\) for coherent motion and \(\:{s}_{i}=\left(1/6,2/3,1/6\right)\) for incoherent motion. The motion stimulus starts at \(\:t=0\) s and stops at \(\:t=30\) s.

By varying the adaptation strength parameter (Fig. 5), we observe that increasing neuronal adaptation reduces the activity for all conditions during stimulus presentation (0 to 30 s) and inverts the order of the incoherent and non-adapting responses, yielding the observed experimental order of the curves: coherent < incoherent < non-adapting.

Importantly, the relative order of the coherent and incoherent curves is conserved as neuronal adaptation strength increases, suggesting that the lower response to coherent motion stems from the mechanism of lower neuronal recruitment, which is not overcome by increasing neuronal adaptation.

The model with adaptation also replicates our experimental order of response curves during the motion after-affect

Fig. 6
figure 6

Effect of neuronal adaptation strength on the simulated BOLD signal after motion stimulus presentation, for each of the experimental conditions (Adapt Coherent, Adapt Incoherent, Non-adapting). Stimulus parameters were \(\:{s}_{c}=\left(\text{0,1},0\right)\) for coherent motion and \(\:{s}_{i}=\left(1/6,2/3,1/6\right)\) for incoherent motion.

After the stimulus is turned off at 30 s, the order of the curves is reversed relative to stimulus presentation, with the coherent condition eliciting the strongest response (Fig. 6). This agrees with the experimental BOLD signal, as well as the observation of a more vivid MAE following the coherent stimulus, compared to the incoherent condition18.

Coherent response is lower than incoherent response across stimulus parametrization and adaptation strength

Fig. 7
figure 7

Effect of adaptation strength and different stimulus parametrizations on the relative distance between the curves during stimulus presentation. The metrics were obtained by averaging the point-by-point difference between curves from 6 to 30 s. Marked with * is the simulation presented in Figs. 5 and 6. (A) Positive values (green) denote simulations where the maximum activity is obtained in response to the non-adapting condition. (B) Positive values (green) correspond to simulations where the response to the coherent condition is weaker than to the incoherent condition. Positive values (green) on both heatmaps represent simulations that replicate the experimental order of the curves.

The definition of the stimulus through the input in each of the three relevant directions (\(\:0^\circ\:\), \(\:270^\circ\:\), \(\:180^\circ\:\)) is one of the main degrees of freedom of the model. By varying the parametrization of the stimulus (Fig. 7), we can investigate whether it is general that neuronal adaptation strength does not affect the relative order of the coherent and incoherent curves, as found in Fig. 5.

The first five parametrizations assume that the total stimulus intensity is the same for both coherent and incoherent stimuli and is equal to 1, with variations in the relative weights of the lateral and down directions in the incoherent stimulus. The next three parametrizations assume equal weight of the three directions in the incoherent stimulus but scale down the total intensity in both stimuli. The last three parameter choices do not equalize total intensity between the two stimuli, with the last parametrization chosen so that the coherent stimulus is stronger than the incoherent one.

In general, coherent response is lower than incoherent response regardless of adaptation strength, with three exceptions. When stimulus parameters are chosen with a large motion energy imbalance so that coherent response is stronger than incoherent response in the absence of adaptation (last column: \(\:{s}_{c}=\) (0, 1, 0) and \(\:{s}_{i}=\) (0.2, 0, 0.2)), introducing adaptation is not enough to invert the order of the coherent and incoherent activations. Similarly, when the total intensity of the stimuli is equal but scaled down to 0.25 (middle column: \(\:{s}_{c}=\) (0, 1/4, 0) and \(\:{s}_{i}=\) (1/12, 1/12, 1/12)), coherent response is larger than incoherent response regardless of adaptation strength. Finally, when total intensity is scaled down to 0.5 (\(\:{s}_{c}=\) (0, 1/2, 0) and \(\:{s}_{i}=\) (1/6, 1/6, 1/6)), coherent response is approximately equal to incoherent response in the absence of adaptation and becomes lower with increasing adaptation. This is the only case where adaptation strength alone modulates the response levels in a way that is congruent with the experimental results and with the first hypothesis. However, this constitutes a very specific stimulus parametrization, chosen so that total stimulus intensity is 0.5, which coincides with the inflection point of the neuronal non-linear response function (Eq. 3). It is thus unlikely that this finely-tuned parametrization is biologically representative. Therefore, for all plausible stimulus parameters, neuronal adaptation does not modulate the relative order of the coherent and incoherent response levels.

Discussion

Visual perception of bistable stimuli has long puzzled scientists and while several mechanisms have been proposed to explain how different interpretations coexist, specific mechanisms have remained elusive. In this work, we focused on the results of our neuroimaging experiments of a bistable plaid stimulus and employed a computational model to study the neural substrate of coherent and incoherent motion perception. Our simulations of motion-sensitive neurons allowed us to reveal the contributions of evoked neural response and adaptation, and how they interact with each other to explain neural responses at a columnar level. Particularly, we tested two alternative hypotheses to explain the lower response to adapting coherent stimuli.

The first hypothesis proposes that coherent stimuli elicit a neural response that matches or surpasses the response to incoherent stimuli in the absence of habituation, but it is adaptation, or repetition suppression, that subsequently diminishes coherent response below incoherent response. This pattern would be evident in simulations with and without neuronal adaptation: in the absence of adaptation, coherent response would be equal to or stronger than incoherent response, whereas with adaptation, coherent response would be lower than incoherent response. Conversely, the second hypothesis states that coherent stimuli elicit a neural response lower than the response to incoherent stimuli, because coherent motion only stimulates neurons tuned to one motion direction, while incoherent motion stimulates at least two neural populations tuned to opposite directions of motion31. Simulations would thus show lower coherent response regardless of adaptation levels.

Our results show that only with adaptation can the experimental order of the response curves be replicated, both during stimulus presentation (Fig. 5) and during the motion after-effect (Fig. 6). However, the level of response to coherent motion is generally lower than the response to incoherent motion, irrespective of adaptation strength (Fig. 7), suggesting that differential neuron population recruitment is responsible for the relative order of the coherent and incoherent activations. Hence, the hemodynamic response observed by Sousa et al.18 arises due to a combination of neuronal adaptation and differences in columnar recruitment.

Our simulations exhibit an important signature of the motion after-effect that occurs after adaptation to a moving stimulus. When the stimulus is turned off, the activity after coherent and incoherent motion is larger than the activity after the non-adapting condition, with adaptation to coherent motion eliciting a stronger after-effect than incoherent motion. It is significant that we replicate the MAE and its neuroimaging signal with our parsimonious model, containing only non-interacting neurons that undergo adaptation. While the perception of the MAE relies on a higher-level mechanism that detects imbalanced responses across different neuronal populations, our model focuses only on the BOLD signal.

A noteworthy limitation of our model is its inability to replicate the second half of the BOLD signal recorded by Sousa et al., where the adapting responses increase beyond the non-adapting response (Fig. 2). This suggests that other computational motifs may be involved, namely network interactions through inhibition or adapted inhibition32.

In conclusion, our computational model of motion-sensitive adapting neurons replicates important features of the neural responses to coherent, incoherent, and non-adapting motion observed in an fMRI experiment by Sousa et al.18, but only when adaptation is present. Furthermore, we tested two competing hypotheses regarding the mechanisms involved in coherent and incoherent motion perception: stronger adaptation to coherent motion vs. stronger neuronal recruitment by incoherent stimuli. By simulating the response of the network to stimuli with various parametrizations and in different adaptation regimes, we determined that the second hypothesis is more likely. Finally, our results also reproduce the neural activity after stimulus exposure, which is congruent with the observed motion after-effect. In sum, this computational work establishes the involvement of both adaptation and differential neuronal recruitment as pivotal mechanisms in the hemodynamic response to the plaid stimulus, enriching our understanding of bistable motion perception.