Abstract
How does the brain integrate complex and dynamic visual inputs into phenomenologically seamless percepts? Previous results demonstrate that when visual inputs are organized coherently across space and time, they are more strongly encoded in feedback-related alpha rhythms, and less strongly in feedforward-related gamma rhythms. Here, we tested whether this representational shift from feedforward to feedback rhythms is linked to the phenomenological experience of coherence. In an electroencephalography (EEG) study, we manipulated the degree of spatiotemporal coherence by presenting two segments from the same video across visual hemifields, either synchronously or asynchronously (with a delay between segments). We asked participants whether they perceived the stimulus as coherent or incoherent. When stimuli were presented at the perceptual threshold (i.e., when the same stimulus was judged as coherent 50% of the time), perception co-varied with stimulus coding across alpha and gamma rhythms: When stimuli were perceived as coherent, they were represented in alpha activity; when stimuli were perceived as incoherent, they were represented in gamma activity. Whether the same visual input is perceived as coherent or incoherent thus depends on representational shifts between feedback-related alpha and feedforward-related gamma rhythms.
Introduction
The visual inputs we receive in real life consist of vast arrays of features scattered across space and dynamically evolving through time. Yet, we phenomenologically experience the world in a spatiotemporally seamless manner. How does the brain integrate the complex and ever-changing inputs across space and time?
Information redundancy in natural inputs may play a critical role: Inputs are redundant across space, with predictable arrangements of both low-level features1 and high-level object content2,3. They are also redundant across time, with events unfolding in highly predictable sequences4,5. These redundancies enable the brain to efficiently predict how visual features need to be integrated across space and time.
Such predictions are carried by cortical feedback flows expressed in dedicated rhythmic channels6,7,8. In our previous study9, we investigated integrative processing by manipulating the spatiotemporal coherence of naturalistic videos, where either the same video or different videos were shown through two apertures in the left and right visual hemifields. Decoding analyses on frequency-resolved electroencephalography (EEG) data demonstrated that when natural inputs aligned across the visual field (and thus could be integrated), stimulus information was present in feedback-related alpha rhythms, whereas when the inputs did not match (and thus could not be readily integrated), stimulus information was present in gamma oscillations. Analytically combining the EEG data with functional magnetic resonance imaging (fMRI) data recorded during the same paradigm, we further showed that integration-related alpha dynamics are linked to representations in the early visual cortex, suggesting that integration is mediated by alpha-rhythmic feedback that traverses the hierarchy back to the early visual cortex.
If rhythmic feedback is indeed critical for visual integration, the degree of feedback should co-vary with the phenomenological experience of a coherent visual world: When a visual input is perceived as coherent, it should be represented more strongly in feedback-related alpha rhythms, while inputs perceived as incoherent should be represented more strongly in feedforward-related gamma rhythms.
Here, we put this prediction to the test. In an EEG study, we presented natural video segments across the two visual hemifields, either synchronously or asynchronously (with one segment relatively delayed in time), and asked participants to report whether they perceived the stimulation as spatiotemporally coherent or not. Critically, this paradigm allowed us to test whether asynchronously presented videos at the perceptual threshold (i.e., stimuli perceived as coherent 50% of the time) are coded differently across alpha and gamma rhythms, depending on the perceptual report.
Results
We manipulated the degree of spatiotemporal coherence by presenting two segments from the same video synchronously or asynchronously (Fig. 1A) through two square apertures left and right of the central fixation (Fig. 1B). Participants were tasked with reporting whether the whole video display appeared as coherent or incoherent to them. When temporal stimulus asynchrony was low, the videos should appear as coherent (i.e., stemming from one seamless movie), but with higher temporal asynchrony they should appear as incoherent (i.e., with noticeable offset). To investigate differences in neural processing at the perceptual threshold, we initially quantified the delays that led to coherent and incoherent perception with equal probability in a behavioral experiment (Fig. 1C).
Fig. 1. (A) Five natural videos were presented through apertures left and right of fixation, either synchronously or asynchronously. (B) Participants (n = 26) were instructed to fixate centrally and judge whether the stimulation was perceptually coherent or incoherent. (C) In an initial behavioral experiment, we used adaptive staircases to determine participants’ threshold delay for each of the five videos. In the subsequent electroencephalography (EEG) experiment, we presented the videos with no delay (coherent), the staircased delay (threshold), and twice the staircased delay (incoherent). We further separated the threshold trials into coherent and incoherent trials based on participants’ responses. We then decoded between the 5 videos within each condition using spectral power patterns in the alpha and gamma bands. (D) In the coherent and incoherent conditions, we found that coherent stimuli were decodable from alpha activity, suggesting prominent feedback propagation, whereas incoherent stimuli were decodable from gamma activity, suggesting dominant feedforward propagation. (E) The threshold condition replicated these results, showing that the representational balance across alpha and gamma rhythms tracks perceived coherence for identical visual inputs. Error bars represent standard errors. Dots represent individual participants. *P < 0.05, +P = 0.064.
In the subsequent EEG experiment, we presented the video stimuli in three conditions: no delay between the two video segments (coherent; 25% of trials), the delay at each participant’s subjective threshold (threshold; 50% of trials), and twice this subjective threshold (incoherent; 25% of trials). We subsequently split the threshold trials into threshold-coherent and threshold-incoherent trials, based on participants’ responses, allowing us to quantify neural representations for the same stimulus when participants perceived it as coherent or incoherent.
We hypothesized that the videos would be coded more strongly in feedback-related alpha when perceived as coherent and more strongly in feedforward-related gamma when perceived as incoherent9. To test this prediction, we decoded the stimuli in each condition using spectral power patterns across parietal-occipital (PO) channels, separately for alpha (8–12 Hz) and gamma (31–70 Hz) frequency bands.
We found that stimuli in the coherent condition were decodable from alpha activity [t(25) = 5.409, P < 0.001], whereas stimuli in the incoherent condition were decodable from gamma activity [t(25) = 2.558, P = 0.017]. Coherent stimuli were decoded better than incoherent stimuli in the alpha frequency band [t(25) = 4.257, P < 0.001], and incoherent stimuli were decoded better than coherent stimuli in the gamma frequency band [t(25) = 2.203, P = 0.037; interaction: F(1, 25) = 30.282, P < 0.001] (Fig. 1D).
The threshold condition replicated this pattern of results: When the stimuli at threshold were perceived as coherent, they were decodable in the alpha band [t(25) = 6.415, P < 0.001], and when the stimuli were perceived as incoherent, they were decodable in the gamma band [t(25) = 2.948, P = 0.014]. Coherent stimuli were decoded better than incoherent stimuli in the alpha frequency band [t(25) = 5.385, P < 0.001]. Numerically, incoherent stimuli were decoded better than coherent stimuli in the gamma frequency band, though this difference was not significant [t(25) = 1.938, P = 0.064; interaction: F(1, 25) = 23.592, P < 0.001] (Fig. 1E).
No effects were found in the theta (4–7 Hz) or beta (13–30 Hz) frequency bands (see Supplementary Information Fig. S1), or in evoked broadband responses (see Supplementary Information Fig. S2).
We additionally performed the statistical analyses using permutation tests, which reproduced the overall pattern of results (see Supplementary Information Table S1).
Discussion
Our results show that the representational balance between alpha and gamma rhythms tracks the phenomenological experience of coherence: The same stimulus is coded in feedback-related alpha when it is perceived as coherent but in feedforward-related gamma when it is perceived as incoherent.
Integration-related alpha dynamics may carry spatiotemporally redundant – and thus predictable – stimulus information upstream10, guiding the adaptive integration of this information into meaningful unified percepts. We have previously demonstrated that spatiotemporally coherent stimulation, which readily allows for integration into a coherent percept, is linked to content coding in feedback-related alpha rhythms9,11. Our results show that these alpha-rhythmic codes indeed relate to the phenomenological experience of visual coherence.
Alpha rhythms have been associated with temporal integration before. The duration of the alpha cycle has been linked to the width of temporal integration windows12,13,14, and the phase and power of pre-stimulus alpha rhythms have been linked to integration versus segregation in subsequently presented stimuli13,15. Beyond this, our findings demonstrate that alpha rhythms also fulfill a function in representing the contents of upstream flows in the cortex, suggesting an active involvement of alpha in binding visual stimuli across time (and space).
Our study probed the concurrent integration across space and time, and perceived incongruencies could originate from incongruent temporal patterns (e.g., motion trajectories) or spatial patterns (e.g., continuation of contours). Whether spatial or temporal properties drive integration to different extents needs to be explored further. Alternatively, integration across space or time may be governed by shared neural mechanisms16.
To conclude, our results suggest that representational shifts from bottom-up gamma to top-down alpha dynamics drive visual integration, highlighting the crucial role of cortical feedback in the construction of seamless perceptual experiences. More broadly, our results provide a rhythmic signature of the feedforward-to-feedback balance in the visual cortex, which can be employed to track subjective changes in perception, attention, or cognition as a function of top-down or bottom-up dominance17.
Methods
Participants
Twenty-six healthy adults (16 females; age = 22.2 ± 2.6 years) with normal or corrected-to-normal vision participated. A minimum sample size of 24 was determined with an effect size of 0.25 as derived from our previous study9, a significance level of 0.05, and a power of 0.8. All participants provided written informed consent and received either course credits or cash reimbursement. The study was approved by the ethical committee of the Department of Education and Psychology at Freie Universität Berlin and was conducted in accordance with the Declaration of Helsinki. All ethical regulations relevant to human research participants were followed.
Stimuli and paradigm
The stimulus set consisted of five short video clips (airplane takeoff, cyclist, roller coaster, ski jumper, and driving car). The videos were presented through two square apertures (6° visual angle) left and right of the central fixation (2.78° offset). The central fixation had a diameter of 0.44° visual angle. Videos were played either synchronously (in each frame, the two images shown through the apertures were from the same frame of the original video) or asynchronously (in each frame, the two images were from different frames of the original video; see Fig. 1A).
We presented stimuli (at 60-Hz refresh rate) and recorded participants’ responses using MATLAB and the Psychophysics Toolbox18,19. We first presented a central fixation for 0.5 seconds, followed by the videos for 3 seconds. Participants were instructed to maintain central fixation and, after the video ended, judge whether the videos were perceptually coherent or incoherent. An example trial is shown in Fig. 1B.
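To make the synchronous/asynchronous manipulation concrete, the following minimal Python sketch converts a delay in milliseconds into a whole number of frames at the 60-Hz refresh rate and lists which source frame each aperture would show. The function names are hypothetical, and the way the lagging segment is indexed (the leading segment runs `offset` frames ahead of it) is an assumption made for illustration; the stimulus code itself was written in MATLAB and may differ.

```python
REFRESH_HZ = 60   # refresh rate stated in the text
VIDEO_S = 3.0     # video duration stated in the text


def delay_to_frames(delay_ms, refresh_hz=REFRESH_HZ):
    """Round a delay in milliseconds to the nearest whole frame."""
    return round(delay_ms * refresh_hz / 1000)


def frame_pairs(delay_ms, leading="left"):
    """Return (left_frame, right_frame) source indices for each displayed frame.

    Assumption: the leading segment is `offset` frames ahead of the lagging
    one, so the source clip must be long enough to cover the extra frames.
    """
    n_frames = int(VIDEO_S * REFRESH_HZ)
    offset = delay_to_frames(delay_ms)
    pairs = []
    for i in range(n_frames):
        lead, lag = i + offset, i
        pairs.append((lead, lag) if leading == "left" else (lag, lead))
    return pairs
```

With a delay of 0 ms both apertures show the same source frame on every refresh (synchronous presentation); any non-zero delay yields a constant frame offset between the two apertures.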
Behavioral experiment
We first conducted a behavioral experiment to estimate subjective integration thresholds for scenes using the QUEST adaptive staircasing procedure20. We ran separate QUEST staircases for each video, initializing the delay between two video segments randomly between 100 and 400 ms in the first trial of each scene and adaptively adjusting the delay afterwards. Each staircase terminated after 80 trials. The staircases converged within this trial count. For each participant, we averaged the delay values in the last 5 trials for each video to obtain the threshold delays.
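The final step of this procedure, averaging the last 5 delay values of each staircase, can be sketched as follows. This is only an illustration of the averaging rule described above: the QUEST update itself is not reproduced, and the function names and the example delay values are hypothetical.

```python
from statistics import mean


def threshold_delay(delays_ms, n_last=5):
    """Average the delays of the last `n_last` trials of one staircase."""
    if len(delays_ms) < n_last:
        raise ValueError("staircase is shorter than the averaging window")
    return mean(delays_ms[-n_last:])


def thresholds_per_video(staircases):
    """Map each video name to its threshold delay.

    `staircases` is a dict: video name -> list of delays (ms), one per trial.
    """
    return {video: threshold_delay(delays) for video, delays in staircases.items()}
```

For example, a staircase whose last five delays are 180, 190, 200, 210, and 220 ms (made-up values) would yield a threshold delay of 200 ms for that video.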
EEG experiment
In the EEG experiment, we presented stimuli in three conditions. In the coherent condition, stimuli were presented synchronously. In the threshold condition, we set the delay between video segments for each scene to the subjective threshold estimated in the behavioral experiment with the same participant. In the incoherent condition, we set the delay for each scene to twice the subjective threshold. Each coherent/incoherent stimulus was presented 30 times, and each threshold stimulus was shown 60 times, yielding 600 trials, which were presented in random order. For the conditions with a delay, the left segment temporally led in half of the trials, and the right segment led in the other half. After the experiment, we separated threshold trials based on each participant’s responses: if a trial was judged as coherent, we assigned it to the threshold-coherent condition; otherwise, we assigned the trial to the threshold-incoherent condition. On average, 93.4% of coherent trials were perceived as coherent; 94.4% of incoherent trials were perceived as incoherent; 50.6% of threshold trials were perceived as coherent, and 49.4% as incoherent.
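The trial structure described above (5 videos × (30 + 60 + 30) repetitions = 600 trials) and the post hoc split of threshold trials can be sketched in Python as follows. The data layout, the function names, and the alternation used to balance the leading side are assumptions for illustration, not the authors' implementation.

```python
import random

VIDEOS = ["airplane", "cyclist", "roller_coaster", "ski_jumper", "car"]
REPS = {"coherent": 30, "threshold": 60, "incoherent": 30}     # per video
DELAY_FACTOR = {"coherent": 0, "threshold": 1, "incoherent": 2}


def build_trials(thresholds_ms, seed=0):
    """Build the shuffled trial list from per-video threshold delays (ms)."""
    rng = random.Random(seed)
    trials = []
    for video in VIDEOS:
        for cond, n_reps in REPS.items():
            for k in range(n_reps):
                # Alternate the leading side so each side leads on half of
                # the delayed trials; synchronous trials have no leader.
                leading = None if cond == "coherent" else ("left", "right")[k % 2]
                trials.append({
                    "video": video,
                    "condition": cond,
                    "delay_ms": DELAY_FACTOR[cond] * thresholds_ms[video],
                    "leading": leading,
                })
    rng.shuffle(trials)  # random presentation order
    return trials


def label_by_report(trial, judged_coherent):
    """Split threshold trials by the participant's response."""
    if trial["condition"] != "threshold":
        return trial["condition"]
    return "threshold-coherent" if judged_coherent else "threshold-incoherent"
```

Because threshold trials are labeled only after the response, the threshold-coherent and threshold-incoherent conditions contain physically identical stimuli and differ only in the perceptual report.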
We recorded EEG and eye-tracking data while participants conducted the experiment. EEG data were acquired using a 10-10 EASYCAP 64-electrode system with a BrainVision actiCHamp amplifier at 1000 Hz. The data were online filtered at 0.03–100 Hz and referenced to FCz. Eye-tracking data were acquired using the Psychophysics and Eyelink Toolbox extensions21, with an Eyelink 1000 Tower Mount (SR Research Ltd., Canada). We recorded the movements of the right eye and conducted a standard 9-point calibration at the beginning of the experiment.
Eye-movement analysis
We used Fieldtrip22 to epoch the eye-tracking data from −0.5 to 3.5 s and downsampled the data to 200 Hz. To check participants’ fixation stability, we estimated the mean and standard deviation (SD) of horizontal eye movements during stimulus presentation (0–3 seconds). There were no significant between-condition differences in horizontal eye movements, neither in the mean, F(3,75) = 1.295, P = 0.295, nor the standard deviation, F(3,75) = 0.334, P = 0.801.
EEG preprocessing
We preprocessed EEG data using Fieldtrip. We first epoched the data from −0.5 to 3.5 s relative to the stimulus onset. We then band-stop filtered the data to remove 50-Hz line noise, referenced the data to the average of all channels, and downsampled the data to 200 Hz. Next, we visually inspected the data and removed noisy trials (74.7 ± 12.6) and channels (2.4 ± 0.3). The removed channels were subsequently interpolated using neighboring channels. We performed independent component analysis (ICA) using the FastICA algorithm, followed by a visual inspection of the topographical and time-course properties of resulting components, to further remove blinks and eye movement artifacts (1.9 ± 0.1 components). Finally, the data were baseline-corrected by subtracting the mean of pre-stimulus signals.
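Two of the steps above, average referencing and baseline correction, reduce to simple array operations. The NumPy sketch below illustrates them under the assumption that epochs are stored as a `(trials, channels, samples)` array sampled at 200 Hz from −0.5 to 3.5 s; it is not the Fieldtrip pipeline, and filtering, artifact rejection, interpolation, and ICA are deliberately left out.

```python
import numpy as np

FS = 200         # sampling rate after downsampling (Hz)
T_START = -0.5   # epoch start relative to stimulus onset (s)


def average_reference(epochs):
    """Re-reference each sample to the mean across channels.

    epochs: array of shape (n_trials, n_channels, n_samples).
    """
    return epochs - epochs.mean(axis=1, keepdims=True)


def baseline_correct(epochs, fs=FS, t_start=T_START):
    """Subtract each channel's mean pre-stimulus signal from the whole epoch."""
    n_base = int(round(-t_start * fs))  # number of pre-stimulus samples
    baseline = epochs[:, :, :n_base].mean(axis=2, keepdims=True)
    return epochs - baseline
```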
EEG spectral analysis
We performed spectral analysis on the preprocessed EEG data using Fieldtrip, replicating the analysis pipeline used previously9,11. For each trial, we conducted the fast Fourier transform (FFT) and estimated the power of each frequency from 4 to 70 Hz in each channel. FFT was performed on the whole stimulation period (0–3 s). We used a single taper with a Hanning window for the low-frequency bands: theta (4–7 Hz, in steps of 1 Hz), alpha (8–12 Hz, in steps of 1 Hz), and beta (13–30 Hz, in steps of 2 Hz). For the gamma band (31–70 Hz, in steps of 2 Hz), we used the discrete prolate spheroidal sequences (DPSS) multitaper method with ±8 Hz smoothing.
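The Hanning-tapered part of this analysis can be approximated in a few lines of NumPy. The sketch below is not the Fieldtrip implementation: the function names are hypothetical, the power scaling is arbitrary, and target frequencies are mapped to the nearest FFT bin. The DPSS multitaper step used for gamma is not reproduced; here the same single-taper estimate is simply read out at the gamma frequencies.

```python
import numpy as np

FS = 200  # sampling rate after downsampling (Hz)
BANDS = {"theta": (4, 7), "alpha": (8, 12), "beta": (13, 30), "gamma": (31, 70)}


def hann_power(trial, fs=FS):
    """Single-taper (Hanning) power spectrum of one trial.

    trial: array of shape (n_channels, n_samples) covering 0-3 s.
    Returns (freqs, power) with power of shape (n_channels, n_freqs).
    """
    n = trial.shape[-1]
    taper = np.hanning(n)
    spectrum = np.fft.rfft(trial * taper, axis=-1)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(spectrum) ** 2 / n  # arbitrary scaling (assumption)
    return freqs, power


def band_pattern(freqs, power, band, step):
    """Power at the band's target frequencies (lo, lo+step, ..., <= hi)."""
    lo, hi = BANDS[band]
    targets = np.arange(lo, hi + 1, step)
    idx = [int(np.argmin(np.abs(freqs - f))) for f in targets]
    return power[:, idx]  # shape (n_channels, n_targets)
```

With a 3-s window at 200 Hz the FFT resolution is 1/3 Hz, so every integer frequency falls exactly on a bin. As a side note, under the common multitaper rule of thumb K = 2TW − 1, a 3-s window with ±8 Hz smoothing would correspond to 47 tapers; the number of tapers actually used is not stated in the text.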
EEG decoding analysis
We performed multivariate decoding analysis to probe rhythmic representations of stimuli using CoSMoMVPA23 and LIBSVM24. For this analysis, we chose 17 parietal and occipital (PO) channels (Pz, P1, P2, P3, P4, P5, P6, P7, P8, POz, PO3, PO4, PO7, PO8, Oz, O1, O2) over visual cortex11,25 and extracted the spectral power patterns across these channels to differentiate between five stimuli within each of the four conditions (coherent, threshold-coherent, threshold-incoherent, and incoherent), separately for each frequency band (theta, alpha, beta, and gamma). The analysis was conducted using a linear support vector machine (SVM) and leave-one-trial-out cross-validation. The number of trials was always balanced across scenes. Additionally, to reduce the dimensionality of the data, we performed PCA on the training data and then projected the PCA solutions (99% variance explained of the training set) onto the testing data9,11,26.
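The cross-validation logic, in particular fitting the PCA on the training trials only and then projecting the held-out trial, can be sketched in NumPy as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the linear SVM; this is a deliberate substitution and not the CoSMoMVPA/LIBSVM pipeline. The function names and the layout of the feature matrix (one row per trial, columns holding channel-by-frequency power values) are assumptions, and trial balancing across scenes is omitted.

```python
import numpy as np


def fit_pca(X_train, var_explained=0.99):
    """Fit PCA on training data; keep components explaining >= 99% variance."""
    mu = X_train.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(ratio, var_explained)) + 1
    return mu, Vt[:k]


def nearest_centroid_predict(Z_train, y_train, z_test):
    """Stand-in classifier (NOT an SVM): pick the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(centroids - z_test, axis=1))]


def loo_decode(X, y):
    """Leave-one-trial-out decoding accuracy.

    X: (n_trials, n_features) power patterns; y: (n_trials,) video labels.
    """
    n = len(y)
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        mu, comps = fit_pca(X[train])            # PCA from training trials only
        Z_train = (X[train] - mu) @ comps.T
        z_test = (X[i] - mu) @ comps.T           # project the held-out trial
        hits += int(nearest_centroid_predict(Z_train, y[train], z_test) == y[i])
    return hits / n
```

Estimating the PCA inside each fold keeps the held-out trial from influencing the dimensionality reduction, which is the point of projecting the training-set solution onto the testing data. With five videos, chance-level accuracy is 20%.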
Statistics and reproducibility
We used one-sample t-tests (one-tailed) to compare the decoding accuracy against the chance level (20%) separately for each frequency band and each condition to detect frequency-specific representations of stimuli. To investigate whether the pattern of decoding accuracies across the 4 frequency bands differs for coherent and incoherent conditions, as hypothesized, we performed 2-condition × 4-frequency two-way ANOVAs. Two such ANOVAs were performed, one comparing the coherent and incoherent conditions, and one comparing the threshold-coherent and threshold-incoherent conditions. After that, we performed paired t-tests (two-tailed) to compare decoding accuracies between the coherent and incoherent conditions separately at each frequency band. Multiple comparisons were corrected using false discovery rate (FDR) correction (P < 0.05).
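Two ingredients of this procedure are illustrated below in plain Python: the one-sample t statistic against the 20% chance level, and an FDR adjustment of a set of P values. The text does not name the FDR method; the Benjamini-Hochberg procedure is assumed here because it is the most common choice. The function names are hypothetical, and converting t to a P value (which requires the t distribution) is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

CHANCE = 0.20  # five videos, so chance-level accuracy is 20%


def one_sample_t(accuracies, mu0=CHANCE):
    """Return (t statistic, degrees of freedom) for accuracies vs. chance."""
    n = len(accuracies)
    t = (mean(accuracies) - mu0) / (stdev(accuracies) / sqrt(n))
    return t, n - 1


def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With 26 participants the one-sample test has 25 degrees of freedom, matching the t(25) values reported in the Results.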
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Data are available on the Open Science Framework (https://doi.org/10.17605/OSF.IO/XJNRT)27.
Code availability
Code is available on the Open Science Framework (https://doi.org/10.17605/OSF.IO/XJNRT)27.
References
1. Geisler, W. S. Visual perception and the statistical properties of natural scenes. Annu. Rev. Psychol. 59, 167–192 (2008).
2. Kaiser, D., Quek, G. L., Cichy, R. M. & Peelen, M. V. Object vision in a structured world. Trends Cogn. Sci. 23, 672–685 (2019).
3. Võ, M. L.-H. The meaning and structure of scenes. Vis. Res. 181, 10–20 (2021).
4. Hogendoorn, H. Perception in real-time: predicting the present, reconstructing the past. Trends Cogn. Sci. 26, 128–141 (2022).
5. de Vries, I. E. J. & Wurm, M. F. Predictive neural representations of naturalistic dynamic input. Nat. Commun. 14, 3858 (2023).
6. van Kerkoerle, T. et al. Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proc. Natl. Acad. Sci. 111, 14332–14341 (2014).
7. Fries, P. Rhythms for cognition: communication through coherence. Neuron 88, 220–235 (2015).
8. Michalareas, G. et al. Alpha-Beta and Gamma rhythms subserve feedback and feedforward influences among human visual cortical areas. Neuron 89, 384–397 (2016).
9. Chen, L., Cichy, R. M. & Kaiser, D. Alpha-frequency feedback to early visual cortex orchestrates coherent naturalistic vision. Sci. Adv. 9, eadi2321 (2023).
10. Bastos, A. M. et al. Canonical microcircuits for predictive coding. Neuron 76, 695–711 (2012).
11. Chen, L., Cichy, R. M. & Kaiser, D. Coherent categorical information triggers integration-related alpha dynamics. J. Neurophysiol. 131, 619–625 (2024).
12. Cecere, R., Rees, G. & Romei, V. Individual differences in alpha frequency drive crossmodal illusory perception. Curr. Biol. 25, 231–235 (2015).
13. Samaha, J. & Postle, B. R. The speed of Alpha-band oscillations predicts the temporal resolution of visual perception. Curr. Biol. 25, 2985–2990 (2015).
14. VanRullen, R. Perceptual Cycles. Trends Cogn. Sci. 20, 723–735 (2016).
15. Leonardelli, E. et al. Prestimulus oscillatory alpha power and connectivity patterns predispose perceptual integration of an audio and a tactile stimulus. Hum. Brain Mapp. 36, 3486–3498 (2015).
16. Walsh, V. A theory of magnitude: common cortical metrics of time, space and quantity. Trends Cogn. Sci. 7, 483–488 (2003).
17. Herz, N., Baror, S. & Bar, M. Overarching states of mind. Trends Cogn. Sci. 24, 184–199 (2020).
18. Brainard, D. H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997).
19. Pelli, D. G. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 437–442 (1997).
20. Watson, A. B. & Pelli, D. G. Quest: A Bayesian adaptive psychometric method. Percept. Psychophys. 33, 113–120 (1983).
21. Cornelissen, F. W., Peters, E. M. & Palmer, J. The Eyelink Toolbox: Eye tracking with MATLAB and the Psychophysics Toolbox. Behav. Res. Methods Instrum. Comput. 34, 613–617 (2002).
22. Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J.-M. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, 156869 (2011).
23. Oosterhof, N. N., Connolly, A. C. & Haxby, J. V. CoSMoMVPA: multi-modal multivariate pattern analysis of neuroimaging data in Matlab/GNU Octave. Front. Neuroinform. 10, 27 (2016).
24. Chang, C.-C. & Lin, C.-J. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011).
25. Kaiser, D., Turini, J. & Cichy, R. M. A neural mechanism for contextualizing fragmented inputs during naturalistic vision. Elife 8, e48182 (2019).
26. Chen, L., Cichy, R. M. & Kaiser, D. Semantic scene-object consistency modulates N300/400 EEG components, but does not automatically facilitate object representations. Cereb. Cortex 32, 3553–3567 (2022).
27. Chen, L., Cichy, R. M. & Kaiser, D. Representational shifts from feedforward to feedback rhythms index phenomenological integration in natural vision [Dataset]. OSF https://doi.org/10.17605/OSF.IO/XJNRT (2025).
Acknowledgements
L.C. is supported by a PhD stipend from the China Scholarship Council (CSC). R.M.C. is supported by the Deutsche Forschungsgemeinschaft (DFG; CI241/3-1, and INST 272/297-1) and by a European Research Council (ERC) starting grant (ERC-2018-STG 803370). D.K. is supported by the DFG (SFB/TRR135, project number 222641018; KA4683/5-1, project number 518483074; KA4683/6-1, project number 536053998), “The Adaptive Mind” funded by the Excellence Program of the Hessian Ministry of Higher Education, Science, Research and Art, and an ERC Starting Grant (ERC-2022-STG 101076057). Views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. The authors thank the HPC Service of ZEDAT, Freie Universität Berlin, for computing time.
Author information
Contributions
Conceptualization, L.C. and D.K.; Data curation, L.C.; Formal analysis, L.C.; Funding acquisition, R.M.C. and D.K.; Investigation, L.C.; Methodology, L.C. and D.K.; Project administration, R.M.C. and D.K.; Resources, L.C.; Software, L.C.; Supervision, D.K. and R.M.C.; Validation, L.C.; Visualization, L.C.; Writing – original draft, L.C. and D.K.; Writing – review & editing, L.C., D.K. and R.M.C.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Mireia Torralba Cuello and Mats W. J. van Es for their contribution to the peer review of this work. Primary Handling Editors: Shenbing Kuang and Jasmine Pan.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, L., Cichy, R.M. & Kaiser, D. Representational shifts from feedforward to feedback rhythms index phenomenological integration in naturalistic vision. Commun Biol 8, 576 (2025). https://doi.org/10.1038/s42003-025-08011-0
DOI: https://doi.org/10.1038/s42003-025-08011-0