Sub-picojoule-per-bit volitional neuromorphic devices for precise targeting and tracking

Huang, Yixuan; Sun, Qihao; Dai, Fuxing; Wang, Fang; Zhou, Xiangyu; Huang, Chenyu; Wu, Yanlin; Deng, Yizhe; Luo, Li; Li, Xiao; Li, Chuang; Ren, Wuyang; Ren, Aobo; Fu, Xiao; Shen, Kai; Hu, Weida; Wu, Jiang

doi:10.1038/s41467-025-66295-6

Article
Open access
Published: 09 January 2026

Nature Communications volume 17, Article number: 339 (2026)

Abstract

The technological revolution driven by artificial intelligence has significantly improved hardware performance, but energy consumption remains a critical bottleneck. State-of-the-art retinomorphic devices, as core components of artificial intelligence hardware, excel in feature extraction but are constrained by passive attention mechanisms that lack the flexibility to actively extract additional features. Inspired by the human visual system, this work introduces a volitional neuromorphic device with active volitional attention regulation. By leveraging gate-voltage-tunable photoconductance to generate an adjustable differential spectral response and employing neural networks to evaluate spectral reconstruction accuracy, the device achieves selective task perception. Experimental results demonstrate a data compression ratio of 1.17% and an extreme information energy efficiency of 0.625 pJ/bit. This work not only advances retinomorphic hardware design but also provides a sustainable pathway for energy-efficient hyperspectral imaging and next-generation neuromorphic computing systems.

Introduction

The current landscape of technological revolution, featuring the astonishing achievements of artificial intelligence (AI) technologies, is ushering in a groundbreaking epoch^1,2,3. State-of-the-art AI hardware, rooted in the classic von Neumann architecture, showcases unparalleled performance but at a considerable energy cost^4,5,6. For instance, cutting-edge H100 GPUs are projected to consume over 13 terawatt-hours of energy annually^7,8, underscoring a critical concern for energy efficiency, particularly in the data-driven era.

The human visual system, renowned for its remarkable efficiency in perceiving, transducing, and interpreting information at approximately 1.0 pJ/bit^9,10,11,12, provides a promising avenue to tackle this concern. This efficiency stems from the intricate mechanisms of visual attention, encompassing both passive autonomic attention (PAA) and active volitional attention (AVA)^13,14. As illustrated in Fig. 1a, PAA is naturally attracted by external stimuli and dominates the initial process of forming an impression directly from salient signals. It involves the fundamental reception, conversion and pre-processing of light signals, realizing primary motion detection. AVA then intervenes when a subjective selection command, guided by prior knowledge and objectives, is sent to the brain’s central thinking region. Accordingly, visual attention is directed through priority maps to efficiently focus on key information while ignoring irrelevant stimuli, which facilitates selective feature extraction and thus an extreme energy efficiency. Leveraging insights from the human visual system holds immense significance in the development of intelligent devices^15,16,17,18.

**Fig. 1: The principle of implementation path and characterization based on the volitional neuromorphic devices.**

Encouragingly, the integration of visual attention into two-dimensional materials-based retinomorphic vision devices has demonstrated milestone breakthroughs in versatile and complicated scenarios, such as intelligent imaging, in-sensor computing and all-in-one hardware^19,20,21,22,23,24,25,26. Nevertheless, these devices predominantly rely on PAA, which constrains their capacity for efficient feature extraction. This results in redundant sensory data and heightened power consumption, especially in multi-sport scenarios such as road tracking, biology follows and adaptive cruise control^27, which highlights the critical necessity of developing volitional neuromorphic devices with exceptional energy efficiency^28.

Here, we demonstrate volitional neuromorphic devices with an extreme energy efficiency at the sub-picojoule-per-bit level by emulating the hierarchical functions of the human visual system. In addition to PAA for dynamic feature extraction, AVA empowers our devices to precisely target specific objects and track their trajectories based on spectral features with an average accuracy over 93%. Such high precision originates from the active feedback and correction attained by optimizing the gate-tunable differential spectral photo-response, inspired by biological transsaccadic memory. As a result, the volitional neuromorphic devices exhibit a data compression ratio of 1.17%, minimizing redundant data, and reach an information energy efficiency (IEE) of 0.625 pJ/bit, approaching that of the human visual system. This advancement is poised to redefine the landscape of AI hardware development, emphasizing brain-like energy efficiency in non-von Neumann architecture-based systems.

Results

Design of volitional neuromorphic devices

Given that the participation of AVA enables a high operation efficiency and low power consumption, we propose volitional neuromorphic devices with extreme energy efficiency that simulate the fovea response to regions of interest, configured with a van der Waals (vdWs) heterostructure of MoSe₂/h-BN/MoS₂ (Fig. S1). The structural, interfacial, morphological and elemental characteristics were evaluated by transmission electron microscopy (TEM), Raman spectroscopy, X-ray photoelectron spectroscopy (XPS), atomic force microscopy (AFM) and energy-dispersive X-ray spectroscopy (EDS). As shown in Fig. 1c, clear interfaces and a hierarchical element distribution are observed in the vdWs heterostructure. The characteristic E_2g peaks (383 cm⁻¹, 233 cm⁻¹ and 1350 cm⁻¹) and the A_1g peak (408 cm⁻¹) are identified for the few-layer MoS₂ flakes (floating gate), MoSe₂ flakes (conduction channel) and h-BN flakes (potential barrier), respectively (Fig. 1d)^29,30,31. To further evaluate the interfacial characteristics, the resolved XPS spectra of the Mo 3d, Se 3d, S 2p, B 1s and N 1s core levels are analyzed in Fig. 1e^29,30,31, and the symmetric peak profiles of h-BN suggest that vdWs interactions predominantly govern the interfacial characteristics^32. In addition, the morphology profile and element distribution are recorded in Figs. S2 and S3, further confirming the successful fabrication of the proposed heterostructure with a floating gate.

In the AVA of the human visual system described above (Fig. 1a), the fovea reflects the polarity peak change of the cone photoreceptors, ensuring fundamental information pre-processing. When attention is focused on a region of interest, the corresponding neuronal potential activity is immediately enhanced and fed back to the cone photoreceptors, promoting selective task perception and cognition. Similarly, as shown in Fig. 1b, the device-level AVA is realized by introducing an active spectral feedback loop, which progressively adjusts the voltage feedback until it outputs the optimal spectral reconstruction accuracy. Specifically, the MoSe₂ photoactive channel receives optical signals and emulates the synaptic weight, and the MoS₂ floating-gate layer maintains the persistent current to realize the memory function. Subsequently, programmable gate voltage (V_g) pulses, serving as the electrical stimulus, are applied to modulate the photoconductance, which generates both non-volatile positive and negative photoconductance (PPC and NPC) over the visible spectrum to achieve polarity regulation (similar to cone cells). The differential operation between the NPC and PPC currents enables polarity-regulated spectral reconstruction by synergistically suppressing interference and amplifying target signals. For the same reason, to observe specific objects within all spectral information, the input voltage is set to the designated optimal V_g for each wavelength, i.e., the gate voltage at which that wavelength is reconstructed with maximum accuracy. Accordingly, only the spectral signal of specific interest is emphasized rather than irrelevant spectral content. This aligns with the way humans naturally process and prioritize visual information, ultimately enabling the targeting and localization of objects (e.g., a runner wearing a blue shirt). The device can maintain 93% spectral recognition accuracy even under 1.17% data compression by dynamically balancing noise suppression and feature retention. Such operation guarantees intelligent detection with subjective feature selection, which significantly reduces irrelevant redundant sensory data while effectively allocating processing resources to reduce energy consumption. As a result, the AVA-involved volitional neuromorphic device adheres to the intelligent philosophy of the human brain, prioritizing appointed spectral sensory inputs over others. This leads to an extreme energy efficiency and provides an innovative way to establish autonomous selection for targeting and tracking.
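
The feedback loop described above can be sketched as a search over candidate gate-voltage pairs that keeps the pair giving the highest reconstruction accuracy at the target wavelength. This is only a minimal illustration: the function `select_gate_voltages`, the early-stop threshold `good_enough` and the toy accuracy model are hypothetical, and in the real device the accuracy would come from the neural-network evaluation rather than a closed-form expression. The toy peak is placed at the blue operating point (4 V, −14 V) reported later in the text purely for illustration.

```python
from typing import Callable, Iterable, Tuple

VoltagePair = Tuple[float, float]  # (V_positive, V_negative) in volts


def select_gate_voltages(
    candidates: Iterable[VoltagePair],
    accuracy_at_target: Callable[[VoltagePair], float],
    good_enough: float = 1.0,
) -> Tuple[VoltagePair, float]:
    """Return the candidate pair with the highest target-wavelength accuracy.

    The loop stops early once the accuracy reaches `good_enough`
    (a hypothetical stopping criterion, not taken from the paper).
    """
    best_pair = None
    best_acc = float("-inf")
    for pair in candidates:
        acc = accuracy_at_target(pair)  # feedback signal for this voltage setting
        if acc > best_acc:
            best_pair, best_acc = pair, acc
        if best_acc >= good_enough:
            break
    if best_pair is None:
        raise ValueError("no candidate voltage pairs supplied")
    return best_pair, best_acc


def toy_accuracy(pair: VoltagePair) -> float:
    """Toy stand-in for the evaluated accuracy; peaks at (4 V, -14 V)."""
    v_pos, v_neg = pair
    return 1.0 / (1.0 + (v_pos - 4.0) ** 2 + (v_neg + 14.0) ** 2)


# Candidate grid: positive gate voltages 0..40 V, negative -40..-2 V, 2 V steps.
grid = [(float(vp), float(vn)) for vp in range(0, 42, 2) for vn in range(-40, 0, 2)]
best_pair, best_acc = select_gate_voltages(grid, toy_accuracy)
print(best_pair, best_acc)
```

An exhaustive grid search is used here for clarity; any optimizer that proposes voltage pairs and reads back an accuracy could take its place.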

Versatile characteristics of volitional neuromorphic devices

The AVA-involved sensing, memory, and computing capabilities are essential prerequisites in volitional neuromorphic devices, which ensure accurate targeting and tracking with extreme energy efficiency^33,34. The intrinsic responsive characteristics are first examined under illumination with primary colors, as shown in Fig. 2a. The as-fabricated device demonstrates an obvious photoresponse that is over two orders of magnitude larger than the dark current, accompanied by exceptional responsivity, detectivity and noise-equivalent power at a low incident light power of 0.1 mW (Figs. S4–S6). The photoconductive behaviors are subsequently evaluated by applying programmable V_g pulses (Fig. 2b, c). A single V_g pulse is initially applied for 1 s, and the light is switched on as soon as the gate voltage is removed. Both PPC and NPC show progressive output, good uniformity and reproducibility. Notably, their responses are much faster (rise time <160 μs, Fig. S7) than that of the human visual system (HVS, ~50 ms), which helps avoid capturing ghost images, especially in high-speed motion recognition^35,36. Good photoresponsive characteristics and the floating gate effect are also validated as functions of pulse number and bias voltage (Figs. S8–S10). Meanwhile, this device possesses a linear memory window of ~80 V with a large stored charge density of 5.65 × 10¹² cm⁻², and maintains good non-volatile stability (Fig. S11), which is beneficial for successive differential operation. These attractive data-perceiving and storing capabilities originate from the appropriate arrangement of the vdWs heterojunction, which allows efficient Fowler-Nordheim tunneling. The corresponding band alignment and carrier dynamics are elucidated in Fig. S12. Briefly, when a negative V_g is applied, electrons initially accumulate in the MoSe₂ conduction channel and holes are trapped in the MoS₂ modulation channel. Upon light illumination, the electrons subsequently tunnel to MoS₂ and recombine with the trapped holes. Accordingly, the decreased number of electrons weakens the photoconductivity, resulting in a reduced photocurrent (i.e., NPC), and vice versa for PPC^37.

**Fig. 2: Photoconductivity properties of volitional neuromorphic devices.**

The corresponding spectral response database is established by mapping the photoresponse as a function of wavelength (450–800 nm) and gate voltage (−40 to 40 V), laying a foundation for the proposed device-level AVA feedback. As shown in Fig. 2d, PPC clearly transforms to NPC as the direction of the vertical electric field is reversed by switching V_g from positive to negative, which is consistent with the aforementioned device physics. Notably, both PPC and NPC respond gradually to both wavelength and gate voltage, indicating effective photoconductive modulation that enables the subsequent differential operation and thus spectral reconstruction. As mentioned earlier, the differential current (I_diff) obeying Kirchhoff’s law is essential for realizing visual attention, and the smaller I_diff, the clearer the object contour, as evidenced in Fig. S13^23,38. Meanwhile, differential signals can suppress common-mode noise and lead to a higher information compression ratio^39,40. In addition, spatial-temporal operations (edge enhancement and extraction) on static and dynamic objects are validated by manipulating the PPC/NPC characteristics of the device (Figs. S14–S17). In this case, the distribution of optimal gate voltages ($V_{\mathrm{g}}^{I_{\mathrm{diff\text{-}min}}}$) for each wavelength, at intervals of 10 nm, is recorded where the differential current reaches its minimum (I_diff-min), as shown in Fig. 2e. It is worth noting that each color has its own optimal pair of gate voltages, e.g., blue (4 and −14 V), green (8 and −32 V) and red (10 and −34 V). From the statistics of the $V_{\mathrm{g}}^{I_{\mathrm{diff\text{-}min}}}$ distribution, the operating conditions derived for the primary colors are expected to facilitate the subsequent implementation of AVA with adjustable feedback.
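
As a hedged illustration of how an optimal gate-voltage pair could be read off such a response database, the sketch below pairs every positive-V_g (PPC) entry with every negative-V_g (NPC) entry at one wavelength and keeps the pair with the smallest differential current. It assumes that I_diff is the magnitude of the sum of the signed photocurrent changes (positive for PPC, negative for NPC); the function name `min_differential_pair` and all current values are hypothetical, not measured data.

```python
from itertools import product
from typing import Dict, Tuple


def min_differential_pair(
    ppc: Dict[float, float], npc: Dict[float, float]
) -> Tuple[float, float, float]:
    """Find the (V_positive, V_negative) pair that minimizes |I_PPC + I_NPC|.

    ppc maps positive gate voltages to positive photocurrent changes,
    npc maps negative gate voltages to negative photocurrent changes.
    Returns (V_positive, V_negative, I_diff_min).
    """
    if not ppc or not npc:
        raise ValueError("both response tables must be non-empty")
    best = None
    for (v_pos, i_pos), (v_neg, i_neg) in product(ppc.items(), npc.items()):
        i_diff = abs(i_pos + i_neg)  # assumed definition of the differential current
        if best is None or i_diff < best[2]:
            best = (v_pos, v_neg, i_diff)
    return best


# Hypothetical photocurrent changes (arbitrary units) at a single wavelength.
ppc_blue = {2.0: 1.0, 4.0: 2.0, 6.0: 3.0}
npc_blue = {-10.0: -1.5, -14.0: -2.0, -20.0: -2.5}

result = min_differential_pair(ppc_blue, npc_blue)
print(result)
```

Repeating this search for each wavelength in the database would give a voltage distribution of the kind summarized in Fig. 2e.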

Spectral active feedback mechanism

Building on the differential current response to different wavelengths in the volitional neuromorphic devices, the device-level AVA with spectral active feedback is conceptualized. This operation leverages the dynamic modulation capabilities of neuromorphic devices to enhance the accuracy of motion spectra and thus focus on objects. Therefore, the crucial aspect of implementation lies in the reconstruction of the object spectrum. The reconstruction process is briefly outlined in Fig. 3a–c. A single differential photocurrent matrix (I_diff-pc), measured at different V_g and wavelengths, serves as the input, and each unknown reflection image input yields a reconstructed spectrum after neural network training via gradient descent, as described by the formula below:

$$I_{\mathrm{diff\text{-}pc}}=\varGamma (V_{\mathrm{positive}},V_{\mathrm{negative}})\,S_{\mathrm{vector}}(\lambda )+\nu$$

(1)

where Γ(V_positive, V_negative) is the spectral response matrix, S_vector(λ) is a vector representing the spectrum, whose length depends on the wavelength resolution, and ν denotes noise. By comparing the reconstructed spectra Ŝ_vector(λ) with the corresponding reference spectra S_vector(λ) in the training dataset, the neural network is optimized by solving Eq. 2:

$$\hat{S}_{\mathrm{vector}}(\lambda )=\mathop{\mathrm{arg\,min}}\limits_{S_{\mathrm{vector}}}\left\Vert \varGamma (V_{\mathrm{positive}},V_{\mathrm{negative}})\,S_{\mathrm{vector}}(\lambda )-I_{\mathrm{diff\text{-}pc}}\right\Vert_{2}^{2}$$

(2)
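
To make the least-squares objective in Eq. 2 concrete, the sketch below minimizes the squared residual between the modeled and measured differential photocurrents with plain gradient descent on the spectrum itself. This is a simplified stand-in for the neural-network training described in the text, not the authors' implementation: the 3 × 2 response matrix, the noise-free measurement, the learning rate and the step count are all hypothetical toy values.

```python
from typing import List


def reconstruct_spectrum(
    gamma: List[List[float]],
    i_meas: List[float],
    lr: float = 0.1,
    steps: int = 500,
) -> List[float]:
    """Gradient descent on ||gamma @ s - i_meas||^2 with respect to s.

    gamma: response matrix, one row per voltage combination,
           one column per spectral bin.
    i_meas: measured differential photocurrents, one per row of gamma.
    """
    m, n = len(gamma), len(gamma[0])
    s = [0.0] * n  # start from an all-zero spectrum
    for _ in range(steps):
        # Residual r = gamma @ s - i_meas
        residual = [
            sum(gamma[r][c] * s[c] for c in range(n)) - i_meas[r] for r in range(m)
        ]
        # Gradient of the squared norm: 2 * gamma^T @ r
        grad = [2.0 * sum(gamma[r][c] * residual[r] for r in range(m)) for c in range(n)]
        s = [s[c] - lr * grad[c] for c in range(n)]
    return s


# Toy problem: three voltage combinations, two spectral bins, no noise.
gamma_toy = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.3]]
s_true = [2.0, 1.0]
i_toy = [sum(g * s for g, s in zip(row, s_true)) for row in gamma_toy]

s_hat = reconstruct_spectrum(gamma_toy, i_toy)
print([round(v, 6) for v in s_hat])
```

With measurement noise or an ill-conditioned response matrix, this direct inversion becomes unstable, which is one common motivation for using a learned model with regularization instead.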

**Fig. 3: Schematic of the working principle of spectral reconstruction.**

During training, a combination of Mean Squared Error and L1 norm was used as the loss function, which mitigates convergence issues and restricts the solution space (Figs. S18, S19)^41,42. The model stabilized after only 79 rounds of training. Using the preferred light source (xenon lamp), simulated reconstructions of different transmission spectra are shown in Fig. 3c and Fig. S20. The spectrum reconstructed by the optimized neural network agrees well with the measured reference spectrum. To further evaluate the model performance, the reconstruction error of the model at various wavelengths is recorded, as presented in Fig. 3d. Across the entire dataset, the model achieved an accuracy rate of 92.2% with a spectral resolution of approximately 0.24 nm.
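
A combined loss of this kind can be sketched as follows. The text does not specify how the two terms are weighted or whether the L1 norm applies to the reconstruction error or to the network weights, so this example assumes an L1 term on the reconstruction error with a hypothetical weight `l1_weight`.

```python
from typing import Sequence


def combined_loss(
    pred: Sequence[float], target: Sequence[float], l1_weight: float = 0.5
) -> float:
    """Mean squared error plus a weighted mean absolute (L1) error.

    l1_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    n = len(pred)
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    l1 = sum(abs(p - t) for p, t in zip(pred, target)) / n
    return mse + l1_weight * l1


# Errors are 0, -1 and 2, so MSE = 5/3 and the mean L1 error = 1.
loss = combined_loss([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
print(loss)
```

The squared term penalizes large deviations strongly, while the absolute term keeps a steady gradient on small residuals, which is a common reason for mixing the two.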

The accuracy of spectral reconstruction hinges on the gate voltages and the resulting I_diff-pc, and a lower I_diff-pc enhances the effectiveness of motion recognition. By optimizing the voltages corresponding to I_diff-pc, accuracy can be improved for specific wavelengths, thereby modulating signals from desired motion targets. This approach effectively suppresses non-target spectral signals and highlights signals from the spectral motion target; the corresponding process is illustrated in Fig. 3d–f. To begin, we establish the average reconstruction accuracy for each wavelength across all voltage combinations as the baseline (Fig. 3d). Taking the spectral information at 733 nm as an example, the V_g mapping for this wavelength can be derived (Fig. 3e). By sorting the differential currents, the minimum differential photocurrent and the corresponding optimal set of gate voltages are obtained. Multiple minima of I_diff-pc are identified, guiding the subsequent spectral reconstruction process for each voltage combination. By subtracting the established average accuracy curve from the accuracy spectrum obtained under the voltage combination corresponding to the minimum I_diff-pc, we reveal the accuracy variation across the spectrum. This comparison helps assess whether the target wavelength achieves the optimal improvement in spectral reconstruction accuracy. As depicted in Fig. 3e, spectral reconstruction accuracy peaks at 733 nm when using the specific voltage combination of (–20, 38 V). This optimal voltage configuration is determined through iterative optimization, reflecting the device’s learning process and serving as a crucial condition for the subsequent formation of operational “consciousness” in the device. Furthermore, the same procedure demonstrates maximum reconstruction accuracy for red, green, and blue light (Fig. 3f), further validating the effectiveness of this method across different wavelengths.
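
The baseline-subtraction step can be illustrated with a short sketch: given the average accuracy per wavelength and the accuracy obtained under one selected voltage combination, it computes the accuracy gain spectrum and reports where the gain peaks. The function name `accuracy_gain` and every accuracy value below are hypothetical placeholders, not data from Fig. 3.

```python
from typing import Dict, Tuple


def accuracy_gain(
    acc_selected: Dict[int, float], acc_baseline: Dict[int, float]
) -> Tuple[Dict[int, float], int]:
    """Subtract the baseline accuracy from the selected-voltage accuracy.

    Both inputs map wavelength (nm) to reconstruction accuracy.
    Returns the gain per wavelength and the wavelength with the largest gain.
    """
    if acc_selected.keys() != acc_baseline.keys():
        raise ValueError("both accuracy spectra must cover the same wavelengths")
    gain = {wl: acc_selected[wl] - acc_baseline[wl] for wl in acc_baseline}
    peak_wl = max(gain, key=gain.get)
    return gain, peak_wl


# Hypothetical accuracies around 733 nm.
baseline = {713: 0.90, 723: 0.91, 733: 0.90, 743: 0.92}
selected = {713: 0.91, 723: 0.93, 733: 0.97, 743: 0.92}

gain, peak_wl = accuracy_gain(selected, baseline)
print(peak_wl, round(gain[peak_wl], 2))
```

If the gain peaks at the intended wavelength, the voltage combination is a good candidate for attending to targets of that color; otherwise another minimum of I_diff-pc can be tried.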

Active target demonstration

The device-level AVA is expected to show its advantages in many scenarios that require actively identifying a desired target among multiple moving objects. Building upon feature extraction, it can achieve fusion and superposition of features through accumulation and averaging. This operation is beneficial for obtaining trajectories of multispectral moving objects captured in the images. In the proof-of-concept demonstration in Fig. 4a–c, sustained attention is illustrated for individuals moving in red, green, and blue colors. The 3D plots display each person’s trajectory in a distinct color, with the backdrop of a tiled floor for spatial reference. The figures clearly delineate the paths of the individuals, demonstrating the system’s ability to track them accurately amidst potential distractions. The information captured in these images is clear and precise, in stark contrast to the data obtained through PAA, which appears cluttered and chaotic (Fig. S21). In addition, the extraction of moving objects and color tracking were demonstrated with a single-device whiskbroom scanning system (Figs. S22, S23). Figure 4d–f illustrates the recognition accuracy of the device over time at different wavelengths. For all frame rates, the recognition accuracy often exceeds 93%, with small fluctuations and relatively few outliers, indicating reliable recognition performance.

The AVA significantly enhances image compression by prioritizing areas with high informational content, thereby optimizing storage and improving the efficiency of data transmission without sacrificing key visual details. For the statistics, pixels in the grayscale range of 0–256 are divided into 20 equally spaced statistical units. The spectral reconstruction mechanism achieves high-efficiency data compression by selectively preserving the target object’s characteristic spectral channels. Key principles include: (1) a threshold comparator filters pixels with effective edges (Fig. 4g, lower part); (2) the original image spectrum is compressed to only the color channel of the target object (Fig. 4g, upper part). Static backgrounds and irrelevant moving objects are ignored, which significantly reduces the data volume and ultimately yields a compression ratio of 1.17%, outperforming other neuromorphic devices as summarized in Table S4 (detailed calculations in the Supplementary Information).

In addition, the AVA mechanism contributes to low energy consumption by selectively processing only relevant information. The information-energy efficiency (IEE) is evaluated in terms of the power consumed by human-like visual information processing, quantified in joules per bit (J/bit)^10,43,44. Using an evaluation method based on the resting state and action potential of human brain neurons (Fig. S24), the IEE of our volitional neuromorphic device can be as low as 0.625 pJ/bit. This efficiency is comparable to the energy consumption of neurons in the human retina, which is approximately 0.714 pJ/bit, and is slightly higher than the energy consumption observed in the neurons of mouse and Drosophila, as shown in Fig. 4h^10,45,46. Meanwhile, the IEE shows clear advantages over other neuromorphic devices (Table S5). This comparison highlights the potential of the AVA mechanism to approach near-biological levels of energy efficiency in AI systems. Notably, good device-to-device repeatability has been substantiated by additional validations of the vdWs heterostructure, device performance and accuracy simulation (Figs. S25–S29). By employing AVA, a more comprehensive and refined understanding of the scene was achieved, facilitating effective tracking and analysis of moving objects with multispectral features.
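
The two compression principles (edge thresholding and retention of a single color channel) suggest a simple way to estimate a compression ratio, sketched below. The paper's actual calculation is given in its Supplementary Information and may differ; here the ratio is assumed to be the number of retained single-channel pixel values divided by the number of values in the original multi-channel frame, and the edge map, threshold and channel count are hypothetical.

```python
from typing import List


def compression_ratio(
    edge_map: List[List[float]], threshold: float, n_channels: int
) -> float:
    """Fraction of the original data kept after thresholding and channel selection.

    edge_map: per-pixel edge strength in the target color channel.
    threshold: comparator level; only pixels above it are retained.
    n_channels: number of spectral channels in the original frame.
    """
    n_pixels = sum(len(row) for row in edge_map)
    if n_pixels == 0 or n_channels <= 0:
        raise ValueError("edge_map must be non-empty and n_channels positive")
    kept = sum(1 for row in edge_map for value in row if value > threshold)
    return kept / (n_pixels * n_channels)  # one retained channel per kept pixel


# Hypothetical 4 x 5 edge map; three pixels exceed the threshold.
edges = [
    [0.0, 0.1, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.2, 0.7, 0.1, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0],
]
ratio = compression_ratio(edges, threshold=0.5, n_channels=3)
print(f"{ratio:.2%}")
```

In this toy frame 3 of 60 values survive; the sparser the target's edges and the more spectral channels in the original frame, the smaller the ratio becomes.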

**Fig. 4: Simulation of AVA mechanism to recognize and select target moving objects.**

We have successfully demonstrated a sub-picojoule-per-bit volitional neuromorphic device by introducing AVA. The AVA operation, with active feedback and real-time correction that leverage gate-tunable differential spectral characteristics, enables our device to precisely identify and track specific objects with an accuracy over 93%. As a result, the volitional neuromorphic device achieves a data compression ratio of 1.17%, which significantly reduces redundant data, and an extreme information energy efficiency of 0.625 pJ/bit. These advancements mark a major breakthrough in neuromorphic engineering, highlighting the potential to redefine the future of AI hardware with brain-like efficiency.

Methods

Materials: Heavily p-doped Si substrates coated with a SiO₂ layer (90 nm) were purchased from Corning Inc. The MoSe₂, h-BN and MoS₂ were supplied by MaiTa Corp. (Nanjing, China). Acetone (anhydrous, 99.5%), isopropanol (IPA, anhydrous, 99.5%) and ethanol (anhydrous, 99.9%) were purchased from Aladdin.

Device fabrication: The SiO₂/Si substrate was thoroughly cleaned by ultrasonic treatment in acetone, IPA, and ethanol for 10 min. MoS₂, h-BN, and MoSe₂ flakes were mechanically exfoliated with Scotch tape and sequentially transferred onto the SiO₂/Si substrate with PDMS assistance, heating to 75 °C, 65 °C, and 75 °C, respectively, during each transfer. The electrodes on the MoSe₂/h-BN/MoS₂ heterostructure were patterned by ultraviolet lithography. Then, the Cr/Au (10/60 nm) contact pads were deposited by electron beam evaporation, followed by a standard lift-off process in acetone. To improve the ohmic contacts between the Au electrodes and MoSe₂, the as-fabricated devices were annealed at 300 °C in an argon atmosphere. Furthermore, we fabricated a 3 × 3 device array (Figs. S30, S31). Despite slight device-to-device variations (Fig. S32), the essential functional switching mechanism remains robust. Future work will therefore focus on refining the heterostructure fabrication process to enhance device-to-device reproducibility and reduce fabrication variability, which represents one of the critical steps for scaling towards intelligent systems.

Material and device characterizations: Raman, PL, and atomic force microscopy (AFM) measurements of MoSe₂/h-BN/MoS₂ were performed with a Raman-atomic force system (Alpha300RA, WITec) under a 532 nm excitation laser diode (2 mW). TEM and energy-dispersive X-ray spectroscopy (EDX) of MoSe₂/h-BN/MoS₂ were performed with a Talos F200S electron microscope and a SUPER X spectrometer, respectively. The morphology and elemental mapping were measured by scanning electron microscopy (SEM, ZEISS EVO MA15) and an energy-dispersive spectrometer (EDS, SDD type 80T), respectively. XPS spectra of the heterojunctions were measured using a Thermo Scientific K-Alpha with an Al Kα (hν = 1486.8 eV) emission source. The optoelectronic properties of the photoelectric memory were measured with a SemiProbe probe station, a semiconductor parameter analyzer (Keithley 4200), and a Platform Design Automation analyzer (PDA, FS-Pro). The lasers of 450, 520 and 635 nm were driven and controlled by a programmable DC power supply (Itech Electronic, IT6100B), a function generator and an irradiatometer. A supercontinuum light source (SuperK Compact) combined with a tunable fiber filter (WLTF-NM-P-1550) was used to emit laser light of different wavelengths and varying linewidths. The noise spectral density was measured using a semiconductor parameter analyzer equipped with a noise testing module (FS-Pro, Primarius). All device measurements were performed under ambient conditions.

Implementation of ResFCNet: In the AVA simulation framework, the neural network consists of an input layer accepting 16-dimensional feature vectors, followed by a sequence of residual blocks with progressively increasing dimensionality (256, 512, and 1024 units, respectively). Each residual block is composed of multiple dense layers with regularization and ReLU activation functions, with skip connections to preserve gradient flow in deep network configurations. The output layer is designed to produce 1650-dimensional vectors for spectral reconstruction. For a comprehensive guide on how to reproduce the specific results and figures presented in this work, including the model training (Fig. S19) and spectral reconstructions (Fig. 3c–f and Fig. S20), please refer to the Appendix: Implementation and Utilization of the ResFCNet Architecture in the Supplementary Information.
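
For orientation only, the layer layout described above can be sketched as a forward pass in plain Python. This is not the authors' code (their implementation is in the GitHub repository cited under Code availability): the block structure (a projection layer followed by two dense layers and a skip addition), the He-style random initialization and the omission of regularization are all assumptions, and the demo uses reduced widths so that it runs quickly without external libraries.

```python
import random
from typing import Dict, List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def dense(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Randomly initialized dense layer with zero bias (He-style scaling)."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def apply_dense(layer: Layer, x: Sequence[float]) -> List[float]:
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x: Sequence[float]) -> List[float]:
    return [v if v > 0.0 else 0.0 for v in x]


def build_resfcnet(in_dim: int, widths: Sequence[int], out_dim: int, seed: int = 0) -> Dict:
    """Assumed layout: one residual block per width, then a linear output layer."""
    rng = random.Random(seed)
    blocks, prev = [], in_dim
    for width in widths:
        blocks.append({
            "proj": dense(prev, width, rng),  # matches dimensions for the skip path
            "fc1": dense(width, width, rng),
            "fc2": dense(width, width, rng),
        })
        prev = width
    return {"blocks": blocks, "out": dense(prev, out_dim, rng)}


def forward(net: Dict, x: Sequence[float]) -> List[float]:
    for block in net["blocks"]:
        skip = apply_dense(block["proj"], x)
        hidden = relu(apply_dense(block["fc1"], skip))
        hidden = apply_dense(block["fc2"], hidden)
        x = relu([h + s for h, s in zip(hidden, skip)])  # residual addition
    return apply_dense(net["out"], x)


def count_parameters(in_dim: int, widths: Sequence[int], out_dim: int) -> int:
    """Parameter count of this assumed layout (weights plus biases)."""
    total, prev = 0, in_dim
    for width in widths:
        total += prev * width + width          # projection layer
        total += 2 * (width * width + width)   # two dense layers in the block
        prev = width
    return total + prev * out_dim + out_dim


# Reduced widths keep the pure-Python demo fast. The sizes stated in the text
# are in_dim=16, widths=(256, 512, 1024) and out_dim=1650.
net = build_resfcnet(in_dim=16, widths=(8, 16, 32), out_dim=10)
spectrum = forward(net, [0.1] * 16)
n_params = count_parameters(16, (256, 512, 1024), 1650)
print(len(spectrum), n_params)
```

The projection layer in each block is one way to let the skip connection cross a change in width; the parameter count printed here applies only to this assumed layout and should not be read as the size of the published model.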

Data availability

The data that support the conclusions of this study are available from the corresponding authors upon reasonable request. Source data are provided with this paper. To ensure full transparency and reproducibility of our work, the primary dataset of ResFCNet has been deposited in the Hugging Face repository (https://huggingface.co/datasets/IFFS-ODS/SpectrumReconstruction_Diff).

Code availability

The codes used for simulation and data plotting are available from the corresponding authors upon reasonable request. The core implementation code of ResFCNet is publicly available on GitHub (https://github.com/IFFS-ODS/SpectrumReconstruction_ResFCN), with specific access details available in the Supplementary Information.

References

Krenn, M. et al. On scientific understanding with artificial intelligence. Nat. Rev. Phys. 4, 761–769 (2022).
Article PubMed PubMed Central Google Scholar
Allen, R. C. Reboot for the AI revolution. Nature 550, 324–327 (2017).
Article ADS Google Scholar
Klinger, J., Mateos-Garcia, J. C. & Stathoulopoulos, K. A narrowing of ai research? SSRN Electron. J. 379, 884–886 (2020).
Google Scholar
Mark Horowitz. Computing’s energy problem (and what we can do about it). In Proc. IEEE International Solid-State Circuits Conference, 10–14 (IEEE, 2014).
Xu, X. et al. Scaling for edge inference of deep neural networks. Nat. Electron. 1, 216–222 (2018).
Article Google Scholar
Gholami, A. et al. AI and memory wall. IEEE Micro 44, 33–39 (2024).
Article Google Scholar
Wiecha, P. R. Deep learning for nano-photonic materials—the solution to everything!? Curr. Opin. Solid State Mater. Sci. 28, 101129 (2024).
Article ADS Google Scholar
Uddin, M. G. et al. Broadband miniaturized spectrometers with a van der Waals tunnel diode. Nat. Commun. 15, 571 (2024).
Article ADS PubMed PubMed Central CAS Google Scholar
Eldred, K. C. et al. Thyroid hormone signaling specifies cone subtypes in human retinal organoids. Science 362, 1–7 (2018).
Article Google Scholar
Harris, J. J., Jolivet, R., Engl, E. & Attwell, D. Energy-efficient information transfer by visual pathway synapses. Curr. Biol. 25, 3151–3160 (2015).
Article PubMed PubMed Central CAS Google Scholar
Gollisch, T. & Meister, M. Eye smarter than scientists believed: neural computations in circuits of the retina. Neuron 65, 150–164 (2010).
Article PubMed PubMed Central CAS Google Scholar
Mehonic, A. & Kenyon, A. J. Brain-inspired computing needs a master plan. Nature 604, 255–260 (2022).
Article ADS PubMed CAS Google Scholar
Theeuwes, J. Top–down and bottom–up control of visual selection. Acta Psychol. 135, 77–99 (2010).
Article Google Scholar
Connor, C. E., Egeth, H. E. & Yantis, S. Visual attention: bottom-up versus top-down. Curr. Biol. 14, R850–R852 (2004).
Article PubMed CAS Google Scholar
Gu, L. et al. A biomimetic eye with a hemispherical perovskite nanowire array retina. Nature 581, 278–282 (2020).
Article ADS PubMed CAS Google Scholar
Liu, Y. et al. Perovskite-based color camera inspired by human visual cells. Light Sci. Appl. 12, 43 (2023).
Article ADS PubMed PubMed Central Google Scholar
Li, Q., van de Groep, J., Wang, Y., Kik, P. G. & Brongersma, M. L. Transparent multispectral photodetectors mimicking the human visual system. Nat. Commun. 10, 1–8 (2019).
Google Scholar
Zhu, Y. et al. Non-volatile 2D MoS2/black phosphorus heterojunction photodiodes in the near- to mid-infrared region. Nat. Commun. 15, 6015 (2024).
Article ADS PubMed PubMed Central CAS Google Scholar
Li, T. et al. Reconfigurable, non-volatile neuromorphic photovoltaics. Nat. Nanotechnol. 18, 1303–1310 (2023).
Article ADS PubMed CAS Google Scholar
Huang, P.-Y. et al. Neuro-inspired optical sensor array for high-accuracy static image recognition and dynamic trace extraction. Nat. Commun. 14, 6736 (2023).
Article ADS PubMed PubMed Central CAS Google Scholar
Liao, F. et al. Bioinspired in-sensor visual adaptation for accurate perception. Nat. Electron. 5, 84–91 (2022).
Article Google Scholar
Lee, S., Peng, R., Wu, C. & Li, M. Programmable black phosphorus image sensor for broadband optoelectronic edge computing. Nat. Commun. 13, 1485 (2022).
Article ADS PubMed PubMed Central CAS Google Scholar
Zhang, Z. et al. All-in-one two-dimensional retinomorphic hardware device for motion detection and recognition. Nat. Nanotechnol. 17, 27–32 (2022).
Article ADS PubMed Google Scholar
Wang, S. et al. Networking retinomorphic sensor with memristive crossbar for brain-inspired visual perception. Natl. Sci. Rev. 8, nwaa172 (2021).
Mennel, L. et al. Ultrafast machine vision with 2D material neural network image sensors. Nature 579, 62–66 (2020).
Article ADS PubMed CAS Google Scholar
Wang, C.-Y. et al. Gate-tunable van der Waals heterostructure for reconfigurable neural network vision sensor. Sci. Adv. 6, 1–7 (2020).
Google Scholar
Jiménez-Bravo, D. M., Lozano Murciego, Á, Sales Mendes, A., Sánchez San Blás, H. & Bajo, J. Multi-object tracking in traffic environments: a systematic literature review. Neurocomputing 494, 43–55 (2022).
Article Google Scholar
Chen, Y. et al. All two-dimensional integration-type optoelectronic synapse mimicking visual attention mechanism for multi-target recognition. Adv. Funct. Mater. 33, 1–16 (2023).
Chen, X. et al. Fabry-Perot interference and piezo-phototronic effect enhanced flexible MoS₂ photodetector. Nano Res. 15, 4395–4402 (2022).
Pan, S. et al. Light trapping enhanced broadband photodetection and imaging based on MoSe₂/pyramid Si vdW heterojunction. Nano Res. 16, 10552–10558 (2023).
Fukamachi, S. et al. Large-area synthesis and transfer of multilayer hexagonal boron nitride for enhanced graphene device arrays. Nat. Electron. 6, 126–136 (2023).
Lin, Y. C. et al. Low-energy implantation into transition-metal dichalcogenide monolayers to form Janus structures. ACS Nano 14, 3896–3906 (2020).
Tang, C. et al. From brain to movement: wearables-based motion intention prediction across the human nervous system. Nano Energy 115, 108712 (2023).
Feng, G., Zhang, X., Tian, B. & Duan, C. Retinomorphic hardware for in-sensor computing. InfoMat 5, e12473 (2023).
Daly, S. et al. Temporal deployment of attention by mental training: an fMRI Study. Cogn. Affect. Behav. Neurosci. 20, 669–683 (2020).
Luo, W. et al. End-to-end active object tracking and its real-world deployment via reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell. 42, 1317–1332 (2020).
Wang, Y. et al. Negative photoconductance in van der Waals heterostructure-based floating gate phototransistor. ACS Nano 12, 9513–9520 (2018).
Wu, G. et al. Miniaturized spectrometer with intrinsic long-term image memory. Nat. Commun. 15, 676 (2024).
Ng, S. E. et al. Retinomorphic color perception based on opponent process enabled by perovskite bipolar photodetectors. Adv. Mater. 36, 2406568 (2024).
Han, J. et al. 2D Materials-based photodetectors with bi-directional responses in enabling intelligent optical sensing. Adv. Funct. Mater. 35, 2423360 (2025).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778. https://doi.org/10.1109/CVPR.2016.90 (IEEE, 2016).
Kim, T., Oh, J., Kim, N. Y., Cho, S. & Yun, S.-Y. Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation. In Proc. Thirtieth International Joint Conference on Artificial Intelligence 2628–2635. https://doi.org/10.24963/ijcai.2021/362 (International Joint Conferences on Artificial Intelligence Organization, 2021).
Strong, S. P., Koberle, R., de Ruyter van Steveninck, R. R. & Bialek, W. Entropy and information in neural spike trains. Phys. Rev. Lett. 80, 197–200 (1998).
Dayan, P. & Abbott, L. F. Theoretical neuroscience: computational and mathematical modeling of neural systems. J. Cogn. Neurosci. 15, 154–155 (2001).
Niven, J. E., Anderson, J. C. & Laughlin, S. B. Fly photoreceptors demonstrate energy-information trade-offs in neural coding. PLoS Biol. 5, e116 (2007).
Berger, T. & Levy, W. B. A mathematical theory of energy efficient neural computation and communication. IEEE Trans. Inf. Theory 56, 852–874 (2010).

Acknowledgements

This work was financially supported by the National Key Research and Development Program of China (2021YFA1401100, 2023YFB3611400), the National Natural Science Foundation of China (52302164, 52202165, 62304031 and 62422410), the Natural Science Foundation of Sichuan Province (2024NSFSC0218), and Special Funding from the Sichuan Postdoctoral Research Project (308324).

Author information

These authors contributed equally: Yixuan Huang, Qihao Sun, Fuxing Dai.

Authors and Affiliations

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
Yixuan Huang, Qihao Sun, Xiangyu Zhou, Li Luo, Chuang Li, Wuyang Ren, Aobo Ren, Kai Shen & Jiang Wu
State Key Laboratory of Infrared Physics, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai, 200083, PR China
Fuxing Dai, Fang Wang, Chenyu Huang, Yanlin Wu, Xiao Fu & Weida Hu
School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
Yizhe Deng
Department of Physics and Astronomy, The University of Manchester, Manchester, M13 9PL, UK
Xiao Li
National Graphene Institute, The University of Manchester, Manchester, M13 9PL, UK
Xiao Li
State Key Laboratory of Electronic Thin Films and Integrated Devices, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
Jiang Wu
Mozi Laboratory, Zhengzhou, 450001, China
Jiang Wu

Authors

Yixuan Huang
Qihao Sun
Fuxing Dai
Fang Wang
Xiangyu Zhou
Chenyu Huang
Yanlin Wu
Yizhe Deng
Li Luo
Xiao Li
Chuang Li
Wuyang Ren
Aobo Ren
Xiao Fu
Kai Shen
Weida Hu
Jiang Wu

Contributions

K.S., W.H., and J.W. conceived the idea and directed the collaboration and execution. Y.H. and Q.S. fabricated the devices and performed the measurements. Y.H., Q.S., F.D., F.W., L.L., Y.D., X.Z., Y.W., and C.H. analysed the experimental data. Y.H., Q.S., and Y.D. performed the device simulation. Q.S., C.L., and W.R. collected the raw signals of motion images. Y.H., Q.S., F.D., F.W., L.L., W.R., and X.L. co-wrote the manuscript with contributions from all the authors. All authors discussed the results and implications and commented on the manuscript at all stages.

Corresponding authors

Correspondence to Kai Shen, Weida Hu or Jiang Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Baker Mohammad, Bowen Zhu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Huang, Y., Sun, Q., Dai, F. et al. Sub-picojoule-per-bit volitional neuromorphic devices for precise targeting and tracking. Nat Commun 17, 339 (2026). https://doi.org/10.1038/s41467-025-66295-6

Received: 26 November 2024
Accepted: 03 November 2025
Published: 09 January 2026
Version of record: 09 January 2026
DOI: https://doi.org/10.1038/s41467-025-66295-6