Abstract
Despite the ubiquity of recurrent connections in the brain, their role in visual processing is less well understood than that of feedforward connections. Occluded object recognition, an important cognitive capacity, is thought to rely on recurrent processing of visual information, but it remains unclear whether and how recurrent processing improves recognition of occluded objects. Using convolutional models of the visual system, we demonstrate how a distinct form of computation arises in recurrent, but not feedforward, networks that leverages information about the occluder to “explain away” the occlusion: recognition of the occluder provides an account for missing or altered features, potentially rescuing recognition of the occluded object. This computation emerges without any explicit constraint and is observed both across a systematic architecture sweep of convolutional models and in a model explicitly constructed to approximate the primate visual system. Consistent with these results, we find evidence of explaining-away in a human psychophysics experiment. Finally, we developed an experimentally inspired recurrent model that recovers fine-grained features of occluded stimuli by explaining-away. The capability of recurrent connections to explain away may extend to more general cases in which undoing context-dependent changes in representations benefits perception.
Data availability
The datasets used in this study (derived Fashion-MNIST, derived ThreeDWorld) are available in a GitHub repository archived on Zenodo (https://doi.org/10.5281/zenodo.17655370). Note that demographic information has been removed from the psychophysics dataset. This information is available by email request from the corresponding author.
Code availability
Code used in this study is available in a GitHub repository archived on Zenodo (https://doi.org/10.5281/zenodo.17655370).
References
Werner, J. S. & Chalupa, L. M. The Visual Neurosciences (MIT Press, 2004).
Werner, J. S. & Chalupa, L. M. The New Visual Neurosciences (MIT Press, 2014).
Serre, T., Oliva, A. & Poggio, T. A feedforward architecture accounts for rapid categorization. Proc. Natl. Acad. Sci. USA 104, 6424–6429 (2007).
Riesenhuber, M. & Poggio, T. Hierarchical models of object recognition in cortex. Nat. Neurosci. 2, 1019–1025 (1999).
Fukushima, K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193–202 (1980).
Nayebi, A. et al. Task-driven convolutional recurrent models of the visual system. Advances in Neural Information Processing Systems 31 (2018).
Kietzmann, T. C. et al. Recurrence is required to capture the representational dynamics of the human visual system. Proc. Natl. Acad. Sci. USA 116, 21854–21863 (2019).
Kubilius, J. et al. Brain-like object recognition with high-performing shallow recurrent ANNs. Advances in Neural Information Processing Systems 32 (2019).
O’Reilly, R. C., Wyatte, D., Herd, S., Mingus, B. & Jilk, D. J. Recurrent processing during object recognition. Front. Psychol. 4, 124 (2013).
Linsley, D., Kim, J., Veerabadran, V., Windolf, C. & Serre, T. Learning long-range spatial dependencies with horizontal gated recurrent units. Advances in Neural Information Processing Systems 31 (2018).
Gwilliams, L. & King, J.-R. Recurrent processes support a cascade of hierarchical decisions. eLife 9, e56603 (2020).
Mohsenzadeh, Y., Qin, S., Cichy, R. M. & Pantazis, D. Ultra-rapid serial visual presentation reveals dynamics of feedforward and feedback processes in the ventral visual pathway. eLife 7, e36329 (2018).
Kovács, G., Vogels, R. & Orban, G. A. Selectivity of macaque inferior temporal neurons for partially occluded shapes. J. Neurosci. 15, 1984–1997 (1995).
Nielsen, K. J., Logothetis, N. K. & Rainer, G. Dissociation between local field potentials and spiking activity in macaque inferior temporal cortex reveals diagnosticity-based encoding of complex objects. J. Neurosci. 26, 9639–9645 (2006).
Kosai, Y., El-Shamayleh, Y., Fyall, A. M. & Pasupathy, A. The role of visual area V4 in the discrimination of partially occluded shapes. J. Neurosci. 34, 8570–8584 (2014).
Tang, H. et al. Spatiotemporal dynamics underlying object completion in human ventral visual cortex. Neuron 83, 736–748 (2014).
Tang, H. et al. Recurrent computations for visual pattern completion. Proc. Natl. Acad. Sci. USA 115, 8835–8840 (2018).
Rajaei, K., Mohsenzadeh, Y., Ebrahimpour, R. & Khaligh-Razavi, S.-M. Beyond core object recognition: recurrent processes account for object recognition under occlusion. PLoS Comput. Biol. 15, e1007001 (2019).
Fyall, A. M., El-Shamayleh, Y., Choi, H., Shea-Brown, E. & Pasupathy, A. Dynamic representation of partially occluded objects in primate prefrontal and visual cortex. eLife 6, e25784 (2017).
Wyatte, D., Curran, T. & O’Reilly, R. The limits of feedforward vision: recurrent processing promotes robust object recognition when objects are degraded. J. Cogn. Neurosci. 24, 2248–2261 (2012).
Goetschalckx, L. et al. Toward modeling visual routines of object segmentation with biologically inspired recurrent vision models. J. Vis. 22, 3773 (2022).
Linsley, D., Kim, J., Ashok, A. & Serre, T. Recurrent neural circuits for contour detection. In Proc. International Conference on Learning Representations (ICLR, 2020).
Thorat, S., Aldegheri, G. & Kietzmann, T. C. Category-orthogonal object features guide information processing in recurrent neural networks trained for object categorization. In Proc. SVRHM Workshop (NeurIPS, 2021).
Thorat, S., Doerig, A. & Kietzmann, T. C. Characterising representation dynamics in recurrent neural networks for object recognition. In Proc. Conference on Cognitive Computational Neuroscience (CCN, 2023).
Yamins, D. L. K. et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl. Acad. Sci. USA 111, 8619–8624 (2014).
Cadena, S. A. et al. Deep convolutional models improve predictions of macaque V1 responses to natural images. PLoS Comput. Biol. 15, e1006897 (2019).
Bao, P., She, L., McGill, M. & Tsao, D. Y. A map of object space in primate inferotemporal cortex. Nature 583, 103–108 (2020).
Yuille, A. & Kersten, D. Vision as Bayesian inference: analysis by synthesis? Trends Cogn. Sci. 10, 301–308 (2006).
Kriegeskorte, N. Deep neural networks: a new framework for modeling biological vision and brain information processing. Annu. Rev. Vis. Sci. 1, 417–446 (2015).
Tang, H. & Kreiman, G. in Computational and Cognitive Neuroscience of Vision, 41–58 (Springer, 2017).
Rust, N. C. & Stocker, A. A. Ambiguity and invariance: two fundamental challenges for visual processing. Curr. Opin. Neurobiol. 20, 382–388 (2010).
Pearl, J. Causality (Cambridge University Press, 2009).
Spoerer, C. J., McClure, P. & Kriegeskorte, N. Recurrent convolutional neural networks: a better model of biological object recognition. Front. Psychol. 8, 1551 (2017).
Spoerer, C. J., Kietzmann, T. C., Mehrer, J., Charest, I. & Kriegeskorte, N. Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision. PLoS Comput. Biol. 16, e1008215 (2020).
Kubilius, J. et al. CORnet: modeling the neural mechanisms of core object recognition. Preprint at bioRxiv https://doi.org/10.1101/408385 (2018).
Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at arXiv https://doi.org/10.48550/arXiv.1708.07747 (2017).
Shi, X. et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting. Advances in Neural Information Processing Systems 28 (2015).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Gan, C. et al. ThreeDWorld: a platform for interactive multi-modal physical simulation. Proceedings of the NeurIPS Datasets and Benchmarks Track (2021).
MacKay, D. M. Towards an information-flow model of human behaviour. Br. J. Psychol. 47, 30–43 (1956).
Dayan, P., Hinton, G. E., Neal, R. M. & Zemel, R. S. The Helmholtz machine. Neural Comput. 7, 889–904 (1995).
Nair, V., Susskind, J. & Hinton, G. E. Analysis-by-synthesis by learning to invert generative black boxes. International Conference on Artificial Neural Networks (2008).
Yildirim, I., Kulkarni, T. D., Freiwald, W. A. & Tenenbaum, J. B. Efficient and robust analysis-by-synthesis in vision: a computational framework, behavioral tests, and modeling neuronal representations. Annual Conference of the Cognitive Science Society (2015).
George, D. et al. A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs. Science 358, eaag2612 (2017).
Huang, Y. et al. Neural networks with recurrent generative feedback. Adv. Neural Inf. Process. Syst. 33, 535–545 (2020).
Michaelis, C., Bethge, M. & Ecker, A. One-shot segmentation in clutter. International Conference on Machine Learning (2018).
Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001).
Goodfellow, I. et al. Generative adversarial nets. Advances in Neural Information Processing Systems 27 (2014).
Kriegeskorte, N., Mur, M. & Bandettini, P. A. Representational similarity analysis - connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4 (2008).
Ernst, M. R., Burwick, T. & Triesch, J. Recurrent processing improves occluded object recognition and gives rise to perceptual hysteresis. J. Vis. 21, 6 (2021).
Kar, K., Kubilius, J., Schmidt, K., Issa, E. B. & DiCarlo, J. J. Evidence that recurrent circuits are critical to the ventral stream’s execution of core object recognition behavior. Nat. Neurosci. 22, 974–983 (2019).
Gilpin, L. H. et al. Explaining explanations: an overview of interpretability of machine learning. In Proc. 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), 80–89 (IEEE, 2018).
Schacter, D. L. & Buckner, R. L. Priming and the brain. Neuron 20, 185–195 (1998).
Kar, K. & DiCarlo, J. J. Fast recurrent processing via ventrolateral prefrontal cortex is needed by the primate ventral stream for robust core visual object recognition. Neuron https://doi.org/10.1016/j.neuron.2020.09.035 (2020).
Froudarakis, E. et al. The visual cortex in context. Annu. Rev. Vis. Sci. 5, 317–339 (2019).
Abadi, A. K., Yahya, K., Amini, M., Friston, K. & Heinke, D. Excitatory versus inhibitory feedback in Bayesian formulations of scene construction. J. R. Soc. Interface 16, 20180344 (2019).
Heinke, D. & Humphreys, G. W. Attention, spatial representation, and visual neglect: simulating emergent attention and spatial memory in the selective attention for identification model (SAIM). Psychol. Rev. 110, 29 (2003).
Deng, J. et al. ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition (2009).
McCloskey, M. & Cohen, N. J. Catastrophic interference in connectionist networks: the sequential learning problem. in Psychology of Learning and Motivation, Vol. 24, 109–165 (Elsevier, 1989).
Martens, F., Bulthé, J., van Vliet, C. & Op de Beeck, H. Domain-general and domain-specific neural changes underlying visual expertise. NeuroImage 169, 80–93 (2018).
Shen, J., Mack, M. L. & Palmeri, T. J. Studying real-world perceptual expertise. Front. Psychol. 5, 857 (2014).
Tanaka, J. W. & Taylor, M. Object categories and expertise: is the basic level in the eye of the beholder? Cogn. Psychol. 23, 457–482 (1991).
Harel, A., Kravitz, D. & Baker, C. Beyond perceptual expertise: revisiting the neural substrates of expert object recognition. Front. Hum. Neurosci. 7, 885 (2013).
Collins, E. & Behrmann, M. Exemplar learning reveals the representational origins of expert category perception. Proc. Natl. Acad. Sci. USA 117, 11167–11177 (2020).
Kim, J., Song, M., Jang, J. & Paik, S.-B. Spontaneous retinal waves can generate long-range horizontal connectivity in visual cortex. J. Neurosci. 40, 6584–6599 (2020).
Jaderberg, M., Simonyan, K. & Zisserman, A. Spatial transformer networks. Advances in Neural Information Processing Systems 28 (2015).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. 3rd International Conference on Learning Representations (2015).
He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. Proceedings of the IEEE International Conference on Computer Vision, 1026–1034 (2015).
Jozefowicz, R., Zaremba, W. & Sutskever, I. An empirical exploration of recurrent network architectures. In Proc. International Conference on Machine Learning, 2342–2350 (2015).
Radford, A., Metz, L. & Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. 4th International Conference on Learning Representations (2016).
Zeiler, M. D., Taylor, G. W. & Fergus, R. Adaptive deconvolutional networks for mid and high level feature learning. In Proc. 2011 International Conference on Computer Vision, 2018–2025 (IEEE, 2011).
Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning, 448–456 (2015).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. Proceedings of the 27th International Conference on Machine Learning, 807–814 (2010).
Jarrett, K., Kavukcuoglu, K., Ranzato, M. A. & LeCun, Y. What is the best multi-stage architecture for object recognition? In Proc. 2009 IEEE 12th International Conference on Computer Vision, 2146–2153 (IEEE, 2009).
Glorot, X., Bordes, A. & Bengio, Y. Deep sparse rectifier neural networks. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 315–323 (2011).
Maas, A. L., Hannun, A. Y. & Ng, A. Y. Rectifier nonlinearities improve neural network acoustic models. Proceedings of the 30th International Conference on Machine Learning (2013).
Peirce, J. et al. PsychoPy2: experiments in behavior made easy. Behav. Res. Methods 51, 195–203 (2019).
Acknowledgements
We thank Brett Larsen and Tyler Benster for discussions and Jon-Michael Knapp for comments on the manuscript. This work was supported by the National Institutes of Health (NS113110, EB02871) and the Simons Collaboration on the Global Brain (SPI542969).
Author information
Authors and Affiliations
Contributions
B.K. and S.D. conceived and designed the study; B.K. and F.C. performed the computational experiments and analyses; B.M. implemented, ran and analyzed the psychophysics experiments. All authors discussed all results and contributed to the writing of the paper.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Tim Kietzmann, Sushrut Thorat, Dietmar Heinke and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kang, B., Midler, B., Chen, F. et al. Recurrent connections facilitate occluded object recognition by explaining-away. Nat Commun (2026). https://doi.org/10.1038/s41467-026-68806-5
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-026-68806-5