Nature Communications
Temporal-Spatial Fusion Vision Hardware Enables Streamlined In-Sensor Computing for Dynamic Scenes
  • Article
  • Open access
  • Published: 15 April 2026

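  • Yi Wu (affiliations 1, 2, 3; equal contribution),
  • Wenjie Deng (ORCID: orcid.org/0000-0002-7846-1101; affiliation 2; equal contribution),
  • Ruihao Liu (affiliation 2),
  • Chutian Xiao (affiliation 2),
  • Jianmiao Guo (ORCID: orcid.org/0000-0001-9832-4203; affiliation 3),
  • Chaoyi Zhu (ORCID: orcid.org/0000-0001-5119-5512; affiliation 3),
  • Qinqi Ren (ORCID: orcid.org/0000-0002-1259-6353; affiliation 3),
  • Zehao Li (affiliation 2),
  • Yushan Wu (affiliation 2),
  • Kexin Li (affiliation 2),
  • Xueliang Ma (affiliation 2),
  • Xiaoting Wang (affiliation 2),
  • Zhangyang Xu (affiliation 2),
  • Zikang Zhao (affiliation 2),
  • Zhijie Chen (ORCID: orcid.org/0000-0003-3988-6446; affiliation 2),
  • Yang Chai (ORCID: orcid.org/0000-0002-8943-0861; affiliation 3) &
  • Yongzhe Zhang (ORCID: orcid.org/0000-0002-3471-4402; affiliations 1, 2)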

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Imaging and sensing
  • Optical properties and devices
  • Sensors and biosensors

Abstract

Time and space constitute fundamental dimensions of physical reality, making their integrated processing crucial for advanced vision perception systems. Current visual information processing faces dual limitations: the von Neumann architecture induces data-transfer bottlenecks, and spatial-feature processing often disregards temporal dynamics while temporal analyzers oversimplify spatial complexity. Here we propose artificial vision hardware that enables intrinsic temporal-spatial fusion through voltage-tunable temporal differentiation with microsecond-scale resolution and photoresponse-weighted spatial compression via pixel binning. The architecture achieves millisecond-level latency from sensing to decision in autonomous driving scenarios through in-sensor spatiotemporal fusion, eliminating external computing dependencies. Experimental validation demonstrates 95% recognition accuracy on a human-action database, while the required operation count is only one tenth that of conventional convolutional processing. This work facilitates physical-level spatiotemporal fusion through the co-optimization of photodetector arrays and weighted control circuits, which could fundamentally reshape machine vision architectures, with potential extensions to real-time decision systems.
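The two operations named in the abstract, temporal differentiation and weighted spatial compression via pixel binning, can be loosely illustrated in software. The following is only a minimal NumPy sketch under stated assumptions, not the authors' hardware method: the function names `temporal_difference` and `weighted_binning`, the scalar `gain` (a hypothetical stand-in for the voltage-tunable response), and the example weights are all illustrative choices that do not come from the paper.

```python
import numpy as np


def temporal_difference(prev_frame, curr_frame, gain=1.0):
    """Frame-to-frame change, scaled by a hypothetical tunable gain.

    Assumption: a simple two-frame difference stands in for the
    temporal differentiation performed by the device.
    """
    prev = np.asarray(prev_frame, dtype=float)
    curr = np.asarray(curr_frame, dtype=float)
    return gain * (curr - prev)


def weighted_binning(frame, weights):
    """Compress each non-overlapping b x b block into one weighted sum.

    Assumption: `weights` plays the role of per-pixel photoresponse
    weights inside a bin; the values used below are arbitrary.
    """
    frame = np.asarray(frame, dtype=float)
    weights = np.asarray(weights, dtype=float)
    b = weights.shape[0]
    h, w = frame.shape
    if weights.shape != (b, b) or h % b != 0 or w % b != 0:
        raise ValueError("frame size must be a multiple of a square bin size")
    # Split the frame into (block row, row in block, block column, column in block).
    blocks = frame.reshape(h // b, b, w // b, b)
    # Weighted sum over the two in-block axes yields one value per bin.
    return np.einsum("irjc,rc->ij", blocks, weights)


# Toy example: a 4 x 4 scene in which two pixels brighten between frames.
prev = np.zeros((4, 4))
curr = np.zeros((4, 4))
curr[0, 0] = 2.0
curr[3, 3] = 4.0
bin_weights = np.array([[1.0, 0.5], [0.5, 0.25]])

change = temporal_difference(prev, curr)
fused = weighted_binning(change, bin_weights)  # 2 x 2 map of weighted changes
```

In this sketch each output value mixes a temporal change with a spatial weighting, so static regions contribute nothing and the output has one value per bin rather than one per pixel. How the actual device realizes these steps physically is described in the paper itself, not here.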

Data availability

The data that support the findings of this study are presented in the paper and the Supplementary Information. Source data are provided with this paper.

Code availability

The code that supports the findings of this study is available from the corresponding authors on request.

Acknowledgements

This work was supported by the National Key Research and Development Project of China (2023YFB2806701, W.D.), the National Natural Science Foundation of China (grants U23A20357, Y.Z.; 62334001, Y.Z.; 62305013, W.D.; and 62574019, W.D.), and the China National Postdoctoral Program for Innovative Talents (No. BX20230033, W.D.).

Author information

Author notes
  1. These authors contributed equally: Yi Wu, Wenjie Deng.

Authors and Affiliations

  1. State Key Laboratory of Materials Low-Carbon Recycling, College of Materials Science and Engineering, Beijing University of Technology, Beijing, 100124, China

    Yi Wu & Yongzhe Zhang

  2. Key Laboratory of Optoelectronics Technology of Education Ministry of China, School of Integrated Circuits, Beijing University of Technology, Beijing, 100124, China

    Yi Wu, Wenjie Deng, Ruihao Liu, Chutian Xiao, Zehao Li, Yushan Wu, Kexin Li, Xueliang Ma, Xiaoting Wang, Zhangyang Xu, Zikang Zhao, Zhijie Chen & Yongzhe Zhang

  3. Department of Applied Physics, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China

    Yi Wu, Jianmiao Guo, Chaoyi Zhu, Qinqi Ren & Yang Chai

Contributions

Y.Z., W.D., Y.W., and Y.C. conceived the concept and designed the experiments. W.D., Y.C., and Y.Z. supervised the project. Y.W. fabricated the devices. Z.C., Y.W., R.L., C.X., Z.L., and Y.S.W. designed the weight control circuits. Y.W., J.G., C.Z., Q.R., X.M., Z.X., and Z.Z. performed the optoelectronic measurements. Y.W., D.W., K.L., X.W., Z.C., Y.C., and Y.Z. analyzed the data. Y.W. and W.D. wrote the paper. All the authors discussed the results and implications and reviewed the paper.

Corresponding authors

Correspondence to Wenjie Deng, Zhijie Chen, Yang Chai or Yongzhe Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Hyeok Kim, Chengkuo Lee, and Haotong Wei for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, Y., Deng, W., Liu, R. et al. Temporal-Spatial Fusion Vision Hardware Enables Streamlined In-Sensor Computing for Dynamic Scenes. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71907-w

  • Received: 19 June 2025

  • Accepted: 01 April 2026

  • Published: 15 April 2026

  • DOI: https://doi.org/10.1038/s41467-026-71907-w
