Bioinspired high-order in-sensor spatiotemporal enhancement in van der Waals optoelectronic neuromorphic electronics

Li, Mengjiao; Chu, Hongling; Gao, Caifang; Yang, Feng-Shou; Huang, Muyun; Miu, Lingling; Li, Jun; Ho, Ching-Hwa; Liu, Jingjing; Lin, Yen-Fu; Zhang, Jianhua

doi:10.1038/s41467-025-63873-6

Article
Open access
Published: 02 October 2025

Nature Communications volume 16, Article number: 8801 (2025)

Abstract

In over-complicated machine vision, target tracking within deep learning paradigms yields inaccurate and energy-intensive outputs. Although spiking neural networks excel at processing dynamic information, challenging tracking environments demand further enhancement in feature correlation learning for efficient target tracking. Distinct from Paired-spike-timing-dependent-plasticity-based architectures, we demonstrate a visual sensor based on van der Waals phototransistors, leveraging Triplet-spike-timing-dependent plasticity to extract bioinspired high-order correlation information, through tunable light-electric cooperation and competition effects on synaptic plasticity originating from interfacial defects-dominated persistent photoconductance phenomena. The universal Triplet-spike-timing-dependent plasticity, with its enhanced spatiotemporal correlation learning characteristic, endows spiking neural networks with better processing capabilities for confusing object classification and dynamic tracking (90.44%) tasks, excelling particularly in seamless tracking post-occlusion, and is further experimentally validated through hardware implementation on a 6×6 van der Waals phototransistor array. This work offers a bottom-up methodology employing device physics to guide mapping of biorational learning for high-performance dynamic tracking towards advanced machine visual technologies.

Introduction

In the era of intelligent machine vision, target recognition and tracking technology have gained significance in smart transportation, security surveillance, and autonomous driving^1,2,3. However, unpredictable dynamic environments, including occlusion scenarios, pose challenges for deep learning algorithms as they hinder the extraction of target features^4,5,6 (Fig. 1a). Conventional approaches based on convolutional neural networks (CNNs) frequently compensate for computational accuracy limitations through architectural expansion, typically by stacking additional convolutional layers^7,8,9, which places significant strain on low-energy and lightweight devices. This fundamental constraint is particularly pronounced in dynamic vision applications, where the frame-by-frame computational paradigm enforces extensive multiply-and-accumulate operations for global weight updates, making it challenging to meet the stringent real-time and energy efficiency requirements for edge deployment (Supplementary Fig. 1). In contrast, spiking neural networks (SNNs) employ an event-driven and localized weight-update principle^10,11,12,13,14, where information transmission occurs only when neurons fire in response to specific stimuli, with weights being asynchronously updated at critical intervals. This sparse temporal coding paradigm inherently eliminates redundant computational overhead and minimizes power-intensive global operations, making SNNs particularly effective for dynamic real-time tasks in edge computing. Given the inherent advantages of the SNN learning characteristics and the ever-growing complexity of dynamic image processing, a systematic investigation of biorational learning principles based on neuromorphic electronic devices offers a promising pathway for executing efficient dynamic tracking tasks without compromising architectural simplicity.

**Fig. 1: Bioinspired spike-timing-dependent plasticity (STDP) learning rules for in-sensor spiking neural networks.**

Spike-timing-dependent plasticity (STDP) is a common bioinspired learning rule of SNNs that adjusts synapse weight based on the relative timing of spikes between connected neurons^15,16,17. This rule can be implemented at the hardware level, particularly benefiting from the coordinated advancements in neuromorphic devices and diverse STDP models. For example, Leong et al. employed WSe₂ floating-gate memtransistors to simulate a set of anti-STDP features, which can reversely adjust the weights to efficiently suppress irrelevant features and improve the convergence rate of the SNN model during the training phase¹⁸. Similarly, John et al. implemented diverse STDP behaviors spanning from classic STDP, anti-STDP, to symmetric STDP through reconfigurable optoelectronic devices, providing a way to reshape the function of neural networks¹⁹. However, currently reported STDP learning rules generally consider a single timing interval between individual pairs of pre- and postsynaptic spikes, limiting their ability to capture complex temporal patterns essential for feature extraction in multi-dimensional spaces. Given the close link between STDP characteristics and computational capabilities in complex dynamic visual scenes, more efforts on strong spatiotemporal correlation learning rules enabled by neuromorphic electronic blocks should be considered^20,21,22,23.

By considering the impact of timing structure on shaping the synaptic plasticity, Triplet-STDP emerges as a sophisticated high-order learning rule that involves more spatiotemporal relationships to execute the multi-dimensional weight plasticity^24,25,26,27. This intricate timing-dependent mechanism enhances the feature extraction capability of Triplet-STDP by selectively reinforcing key connections within neural networks (Fig. 1b). However, the complexity of STDP behaviors raises significant challenges for neuromorphic electronic devices in effectively mapping high-order STDP learning rules onto networks for dynamic visual information processing, an achievement that remains unrealized^28,29,30. In this work, we demonstrate multiple-order STDP behaviors of an artificial vision system based on van der Waals (vdW) phototransistors to empower the enhanced feature processing capability of SNNs. Benefiting from the interfacial defects-dominated persistent photoconductance (PPC) behavior, as confirmed by low-frequency noise analysis, both the Paired- and Triplet-STDP can be achieved in the phototransistors by multi-dimensionally tuning the coupling effect of optoelectrical and electrical properties on device conductance. As a neuromorphic vision sensor platform, the timing interval- and sequence-dependent Triplet-STDP characteristics of the optoelectronic devices are further investigated, which underscore the critical merits of vision tasks in shaping the high-order learning dynamics within SNNs. Specifically, through precise synaptic weight mapping on the fabricated vdW phototransistor array, we successfully implement a Triplet-STDP-based SNN hardware and achieve enhanced spatiotemporal feature extraction capabilities in hybrid image classification tasks. Significantly, we further integrate the Triplet-STDP mechanism within a YOLO-SNN architecture for vehicle tracking, which achieves 90.44% accuracy even under occlusion conditions. These findings provide a device physics-shaped learning rule that fosters the innovation of SNN-based architectures toward enhanced edge visual computing technologies.

Results

Bioinspired multi-order spike-timing-dependent plasticity

STDP, as a pivotal neuroregulatory mechanism in bioscience, significantly influences learning, memory, and perception functions through dynamic synaptic weight adaptation²⁹. The weight-update rules in STDP are governed by precise spiking patterns (including both spike count and temporal sequence), which collectively define characteristic temporal windows that determine the direction and magnitude of synaptic modifications (Fig. 1c, d)^31,32,33. In SNNs, these timing-dependent plasticity rules govern the higher-order temporal pattern encoding capability of the network by facilitating non-linear integration of information across multiple timescales. The Paired-STDP paradigm, for example, is constrained by its single temporal window between pre- and postsynaptic spikes and therefore establishes only a unidimensional mapping axis. This severely limits the spatiotemporal feature extraction capacity of networks. As visualized in Fig. 1e, this architecture leads to substantial overlap in the feature space representation of distinct input patterns with subtle temporal differences, significantly compromising classification performance. In contrast, the Triplet-STDP framework introduces a sophisticated dual-window mechanism. It enables hierarchical temporal processing: the primary window captures fundamental spike sequence correlations, while the secondary window resolves finer temporal substructures within these sequences. This multi-scale integration strategy effectively projects similar input patterns into orthogonal feature subspaces (Fig. 1f), dramatically enhancing pattern discriminability by disentangling previously overlapping distributions. Thus, the high-order temporal processing capability of Triplet-STDP is expected to strengthen the sparse network connection, further endowing the SNN with enhanced event-correlation sensitivity and feature extraction characteristics to deal with complicated dynamic image processing scenes.

Optoelectronic-coupling-dependent synaptic plasticity

Neuromorphic transistor devices rely on the controllable manipulation (e.g., by voltage or light) of conductance changes of the semiconductor channels integrated into the transistor array to map complex biological learning processes^7,34,35,36. Given the challenges in demonstrating multi-order STDP behaviors for neuromorphic devices, harnessing rich physical properties to investigate the learning dynamics and build effective learning rules holds significant promise for developing advanced computing technologies. Light and electric stimuli, as typical conductance modulators, can provide ample opportunities to engineer synaptic behaviors via rich light-electric interactions, especially for vdW materials featuring a high surface-to-volume ratio and good optoelectronic properties^37,38,39,40. Specifically, indium selenide (InSe) is frequently employed as an exemplary platform for investigating optoelectronic phenomena and biomimetic optoelectronic synaptic behaviors (Supplementary Table 1), owing to its intrinsic photoresponse, which stems from a direct bandgap that enhances light absorption⁴¹, along with its high carrier mobility and low effective mass, which promote efficient charge transport and collection^42,43.

Hence, for diverse feature shaping and intrinsic mechanism exploration of the Triplet-STDP characteristics, in this work, we develop a vdW phototransistor-based vision sensor device, which consists of a photosensitive InSe flake serving as the semiconductor channel, a thin indium film serving as the passivation layer for the channel region, and a SiO₂/Si substrate with a 2D/3D heterostructure (Fig. 2a). The detailed device fabrication process is provided in the Methods section. The corresponding optical microscope image and Raman spectrum show the typical characteristic vibrational peaks of the InSe channel (Supplementary Fig. 2). The cross-sectional transmission electron microscopy (TEM) characterization with energy-dispersive X-ray spectroscopy (EDS) analysis further confirms the structural constitution and high quality of the fabricated devices (Supplementary Fig. 2). Typically, the measured electrical characteristics describe n-type conduction behavior with a clockwise hysteresis loop under dark conditions, which indicates the existence of charge-trapping states in the fabricated devices (Fig. 2b). Under light illumination, the on-current increases and the recorded transfer curves shift toward negative voltage, suggesting a controllable photo-induced n-type doping effect with different light intensities (Supplementary Fig. 3). Distinct from irreversible doping methods, such photo-doping behavior is reversible, either under the application of gate bias or spontaneously, which underscores the tunable synaptic plasticity via light-electric interactions (Supplementary Fig. 4).

**Fig. 2: Tunable optoelectronic coupling characteristics and Paired-STDP behavior based on vdW phototransistors.**

Basic weight-update behavior of the fabricated vdW phototransistors underscores their potential in shaping biologically plausible STDP characteristics. Under optical stimulation with intensities ranging from 0.26 to 2.63 mW cm⁻², the devices show persistent photoconductance (PPC) behavior characterized by photoconductance rise during illumination followed by exponential decay post-stimulation, consistent with photo-doped transfer characteristics (Supplementary Fig. 5a). Systematic modulation of light intensity, pulse width, and pulse number induces a controllable transition from short-term plasticity (STP) to long-term plasticity (LTP), demonstrating precise optical control over synaptic strength (Supplementary Fig. 5). The observed reversible synaptic potentiation-depression transitions under combined optical and electrical stimulation further confirm the critical role of optoelectronic coupling in synaptic weight modulation (Supplementary Fig. 6). Remarkably, these devices achieve ultra-low energy consumption of 8 fJ per spike for electrical read operations, surpassing most reported organic and vdW-based phototransistors (Supplementary Fig. 7). Device-to-device variation characterization across approximately 20 InSe-based units reveals reproducibility in photodoping behavior and synaptic responses, with a variation of 9.5%, confirming the reliability and uniformity of these devices for neuromorphic applications (Fig. 2c and Supplementary Fig. 8).

Typical STDP feature in vdW phototransistors can be described by the synaptic weight change (Δw). Figure 2d exhibits a classical asymmetric Paired-STDP characterized by two synaptic weight curves in the first and third quadrants. Such Δw curves are governed by the time-dependent modulation of device conductance under pulsed stimulation. Taking the formation of Δw in the first quadrant as an example, when a couple of electric-light (E-L) stimuli with intervals (Δt) increasing from 10 to 100 ms are applied, the normalized photoconductance curves exhibit a blue shift tendency (Fig. 2e). Notably, the fitted time constants for positive and negative branches in the classical asymmetric Paired-STDP are 150 ms and 60 ms (Supplementary Note 1), respectively, closely aligning with the millisecond response of human neurons²⁸. Other Paired-STDP behaviors, such as symmetric characteristics located in the first and second quadrants, can also be obtained by tuning the E-L parameters (Supplementary Fig. 9). Interestingly, when a strong light intensity and a positive gate voltage are applied, we observe an atypical STDP behavior differing from typical biological phenomena. As shown in Fig. 2f, Δw evolves oppositely to classic STDP at Δt < 0, with minimal weight change at short time intervals (~10 ms) and increasing at longer intervals. Figure 2g and Supplementary Fig. 10 show the opposite conductance trends at the L-E branch (i.e., the third quadrant for atypical STDP and classical STDP) with different optoelectronic parameters. From the device perspective, such synaptic weight variation is closely related to the controllable regulation of photoconductance through the strong light-electric coupling effect, for which the physical mechanism can be understood in the following section. From a learning rule perspective, this atypical STDP enriches synaptic weight adjustment modalities, offering valuable insights for optimizing weight updating rules of SNN architectures in our work.
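
As a minimal sketch of the asymmetric Paired-STDP window described above, the snippet below models each branch as a single exponential using the fitted time constants quoted in the text (150 ms and 60 ms). The function name `paired_stdp`, the unit amplitudes `a_plus` and `a_minus`, and the sign convention (Δt > 0 for the first-quadrant E-L branch) are illustrative assumptions, not values reported by the authors.

```python
import math

TAU_PLUS_MS = 150.0   # fitted time constant, positive branch (from the text)
TAU_MINUS_MS = 60.0   # fitted time constant, negative branch (from the text)


def paired_stdp(dt_ms, a_plus=1.0, a_minus=1.0):
    """Weight change for one spike pair separated by dt_ms.

    Assumed convention: dt_ms > 0 gives potentiation (first quadrant),
    dt_ms < 0 gives depression (third quadrant). The amplitudes are
    hypothetical placeholders, not fitted values.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / TAU_PLUS_MS)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / TAU_MINUS_MS)
    return 0.0


# Weight change over the interval range used in the measurements (10 to 100 ms).
window = {dt: paired_stdp(dt) for dt in (-100, -50, -10, 10, 50, 100)}
```

In this classical form, |Δw| is largest at short intervals on both branches; the atypical behavior in Fig. 2f would require a different third-quadrant branch and is not modeled here.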

Mechanism of the trap-assisted light-electric interactions

The rich STDP behaviors observed in the fabricated vdW phototransistors motivate a systematic investigation of optoelectronic coupling effects on dynamic carrier transport mechanisms. In particular, the PPC effect in optoelectronic devices with tunable temporal decay properties has emerged as a critical determinant for achieving programmable synaptic plasticity and dynamic weight modulation of neuromorphic devices (Supplementary Table 4). Given that such PPC behavior typically stems from defect-mediated charge trapping/detrapping processes at heterointerfaces or within intrinsic vdW materials^44,45,46, controlled experiments employing h-BN isolation layers at the 2D/3D semiconductor/dielectric interface are performed for mechanism clarification. Comparative device characterization reveals that h-BN-incorporated structures exhibit photocurrent responses that precisely track illumination signals with microsecond-scale switching dynamics, effectively suppressing PPC effects and demonstrating conventional photodetector behavior (Supplementary Fig. 11). Such a distinct difference in photoconductivity hints at the dominant role of 2D/3D interfacial defects in manipulating the synaptic plasticity of vdW phototransistors. Dynamic carrier transport characterizations are further conducted by low-frequency noise analysis to visualize the hetero-interfacial charge trapping-detrapping phenomenon (Supplementary Fig. 12). The power spectral density (PSD) of the current fluctuations ($S_I$) for both devices exhibits an ideal 1/f behavior, indicating the uniform distribution of charge fluctuations. Following the semi-empirical equation $S_I/I_{\mathrm{ds}}^{2} = A/f$, a critical parameter, the noise amplitude (A), can be extracted, which marks the level of the charge fluctuation behaviors. As expected, the device without the h-BN layer exhibits a higher A than that of the h-BN-based device, further confirming the critical role of hetero-interfacial defects during the photo-doping process.
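
To illustrate how the noise amplitude A can be read off a measured spectrum, the sketch below rearranges the semi-empirical relation to A = f · S_I / I_ds² and averages it over frequency. The helper name `noise_amplitude`, the geometric-mean estimator, and the synthetic spectrum are assumptions made for illustration; they are not the authors' extraction procedure or data.

```python
import math


def noise_amplitude(freq_hz, s_i, i_ds):
    """Estimate A in S_I / I_ds^2 = A / f.

    For an ideal 1/f spectrum, f * S_I / I_ds^2 is constant and equals A.
    A geometric mean is used (an illustrative choice) because noise spectra
    are normally inspected on log-log axes.
    """
    logs = [math.log(f * s / i_ds ** 2) for f, s in zip(freq_hz, s_i)]
    return math.exp(sum(logs) / len(logs))


# Synthetic, noise-free 1/f spectrum (made-up values, 1 Hz to 1 kHz).
freqs = [10 ** (k / 10) for k in range(31)]
i_ds = 1e-6          # hypothetical drain current in amperes
a_true = 1e-8        # hypothetical noise amplitude
spectrum = [a_true * i_ds ** 2 / f for f in freqs]

a_est = noise_amplitude(freqs, spectrum, i_ds)
```

Comparing `a_est` for devices with and without the h-BN layer would then give the relative level of interfacial charge fluctuation discussed above.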

Based on the elucidated physical picture of the PPC characteristics, the observed rich STDP behaviors can be understood as trap-assisted light-electric (TALE) coupling processes. Taking the STDP feature in the first quadrant (E-L) as an example, the negative electric stimulus results in a charge de-trapping process, which increases the carrier concentration of the vdW channel by lowering its Fermi level (the energy band diagram in Fig. 2h); the following light signal further contributes more free carriers by generating photoexcited electron-hole pairs. Such a TALE cooperative effect on the photoconductivity leads to an exponential decay of Δw as time intervals increase. In contrast, for the STDP feature in the third quadrant (L-E), although the first light signal pulls up the carrier concentration, the following positive electric stimulus induces a depletion effect by raising the E_F and trapping carriers, resulting in a TALE competitive effect (Fig. 2i). Interestingly, when the light intensity is weak, the electric-induced trapping probability is lower, resulting in light-dominant photoconductance and normal STDP behavior in the third quadrant of Fig. 2d; when the light intensity is strong, the electric-induced trapping probability is higher, and the carrier depletion-dominant photoconductivity leads to an abnormal STDP behavior in the third quadrant of Fig. 2f. A detailed description of the relevant carrier transport process is provided in Supplementary Fig. 13. Consequently, the TALE interaction effect on synaptic plasticity can be categorized into TALE cooperation, light-dominant TALE competition, and electric-dominant TALE competition, expanding the design space for engineered synaptic plasticity beyond biology to fulfill neural network learning rules.

Triplet-spike-timing-dependent plasticity in phototransistor

The uncovered TALE coupling behaviors of the phototransistor enable the emulation of diverse STDP behaviors. Given the advantages of Triplet-STDP-based SNN in processing complex spatiotemporal information, we further explore the demonstration of Triplet-STDP by manipulating the conductance change under successive triplet signals. The operation schemes featuring two timing windows for Triplet-STDP are sketched in Fig. 3a, b, and can be assigned as two sets of successive triplet stimuli, i.e., light-electric-light (L-E-L) and electric-light-electric (E-L-E). As a result, the recorded photoconductance changes, which show good repeatability, contain diverse TALE coupling characteristics of two paired-spike events, such as electric-dominated negative conductance and light-dominated positive photoconductance behaviors, which provide more room for engineering the high-order STDP paradigm (Supplementary Fig. 14). For example, the L-E-L branch consists of an L-E and an E-L spiking pair, corresponding to enhanced carrier trapping behavior by lowering the Fermi level after light illumination and subsequent re-dominance of photoconductance effects, respectively. For symmetric timing windows of spiking pairs (|Δt₁| = |Δt₂|), a potentiation STDP characteristic that resembles typical Paired-STDP behavior can be observed (Fig. 3c), suggesting that the light spike plays a dominant role, with photoconductance enhancement outweighing the defect trapping-induced conductance reduction under the weak positive electric pulse. For the E-L-E branch, in contrast, the successive E-L and L-E spiking events introduce the light-dominated and electric-dominated TALE competition effect on the weight update features, respectively, where the effect of defect-captured carriers under the electric pulse exceeds that of photogenerated carriers under the light pulse.

**Fig. 3: The demonstration of Triplet-STDP based on controllable photoconductance.**

To fully investigate the Triplet-STDP characteristics, we examine the synaptic weight variation depending on a variety of triplet stimulus scenarios with different timing windows. Taking the L-E-L framework as an example, for fixed Δt₁, the evolution of Δw exhibits a distinct decay with increasing Δt₂, which is derived from the gradual recovery of empty trapping states that can capture more photogenerated carriers (Fig. 3d). Such phenomena can be further described by the exponential function in Fig. 3e, and the longer the Δt₁, the faster the degradation of Δw. This can be understood as the dominant role of the light-dominated TALE competition effect on weight variation gradually weakening as the timing window Δt₁ is prolonged. In contrast, for the E-L-E framework, Δw shows a depression behavior and decreases with increasing Δt₂ due to the recombination of the photogenerated carriers, which weakens the defect trapping ability during the subsequent electric pulse. Interestingly, we found that the overall evolution of Δw for Triplet-STDP mainly follows the latter timing window, Δt₂, showing a last spike pair-dominant STDP behavior. This scenario aligns with neurobiological studies where synaptic plasticity in the visual cortex is especially sensitive to the timing of the final spike as it tends to override earlier ones^31,47. To emphasize the flexible tunability of triplet signals in shaping the multi-order STDP behaviors, we systematically summarize experimental Triplet-STDP results across different Δt₁ and Δt₂. As shown in Fig. 3f, all patterns of Δw can be categorized into two classes, potentiation and depression, according to the corresponding input signals, L-E-L and E-L-E, respectively. Specifically, the Δw patterns present a short-interval prominent trend aligning with the biological high-order STDP learning principles of rapid self-adaptation³¹ and can be precisely located via a pair of timing windows, which provide the fundamental prototype for fully establishing STDP learning rules.
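
The exponential description of Δw versus Δt₂ at fixed Δt₁ can be turned into a time constant with a simple log-linear least-squares fit, sketched below. The function `fit_exponential_decay` and the synthetic data points are illustrative assumptions; the fit requires positive inputs, so for the depression (E-L-E) branch one would fit |Δw| instead.

```python
import math


def fit_exponential_decay(dt_ms, dw):
    """Fit dw = A * exp(-dt / tau) by a straight-line fit to ln(dw).

    Returns (A, tau_ms). All dw values must be positive.
    """
    n = len(dt_ms)
    ys = [math.log(w) for w in dw]
    mean_x = sum(dt_ms) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in dt_ms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dt_ms, ys))
    slope = sxy / sxx                      # equals -1 / tau
    intercept = mean_y - slope * mean_x    # equals ln(A)
    return math.exp(intercept), -1.0 / slope


# Synthetic decay curve (made-up amplitude 0.8 and tau = 50 ms, no noise).
dt2_ms = [10, 20, 40, 60, 80, 100]
dw_values = [0.8 * math.exp(-t / 50.0) for t in dt2_ms]

amp, tau_ms = fit_exponential_decay(dt2_ms, dw_values)
```

Repeating the fit for each fixed Δt₁ gives one decay constant per curve, which is one way to quantify the trend that Δw degrades faster as Δt₁ grows.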

To advance the practical implementation potential of the proposed Triplet-STDP feature in neuromorphic computing systems, we systematically evaluate its robustness and material universality across different phototransistor platforms. First, device-level characterization of eight representative InSe phototransistors confirms reproducible Triplet-STDP behavior under symmetric timing-window stimulation, establishing baseline performance metrics (Supplementary Fig. 15). Building upon the design principle of the individual trapping-based InSe phototransistor, we further investigate the universality of Triplet-STDP behaviors across typical vdW material systems. For example, molybdenum disulfide (MoS₂), a widely studied layered semiconductor material known for its scalability and fabrication compatibility^48,49,50, is employed to fabricate a trapping-based MoS₂ phototransistor array by incorporating hetero-dielectric layers to enable controlled charge trapping/detrapping dynamics. As expected, the MoS₂ phototransistors successfully replicate multiple Triplet-STDP learning rules under optoelectronic stimulations with good device uniformity across the entire array, mirroring the key optoelectronic properties observed in InSe phototransistors (Supplementary Figs. 16 and 17). These results confirm the generalizability of our proposed phototransistor device design among typical vdW semiconductors and provide a solid foundation for subsequent array-level integration and system demonstration. On the other hand, it is worth mentioning that while InSe phototransistors offer an ideal individual device platform for feature shaping of Triplet-STDP, the material vulnerability to chemical solvents such as acetone during the photolithography process leads to severe degradation of carrier transport properties^51,52, hindering the practical application of InSe phototransistor arrays in the hardware deployment of Triplet-STDP-based SNNs. This limitation has long been acknowledged as a critical challenge in exploiting InSe electronic devices for scalable logic circuits and in-memory computing^53,54. Therefore, considering the good universality of Triplet-STDP behavior in MoS₂ phototransistor arrays, we ultimately adopt the MoS₂ phototransistor array for the subsequent hardware implementation of SNN-based image processing applications.

High-order spike-timing-dependent-plasticity learning rules

Triplet-STDP features multiple timing windows to enable recognition of complex temporal patterns and high-order spatiotemporal correlations, which is expected to boost the spatiotemporal information processing capability of SNNs^55,56. Following the typical Paired-STDP, the construction of high-order STDP learning rules, such as the Triplet-STDP rules that contain two timing windows, starts with a semi-empirical description:

$$W(\Delta t_{1},\Delta t_{2})=A_{3}\,\mathrm{e}^{-\frac{|\Delta t_{1}|}{\tau_{1}}}\,\mathrm{e}^{-\frac{|\Delta t_{2}|}{\tau_{2}}}$$

(1)

where triplet spikes (t_pre, t_post, t′_post) affect synapse weight based on their timing difference, Δt₁ = t_post – t_pre and Δt₂ = t′_post – t_post. Here, A₃ is the scaling factor, and the time constants τ₁ and τ₂ play a critical role in determining both the temporal sensitivity to spike timing and the learning window modulation characteristics of the Triplet-STDP mechanism⁵⁷. To quantitatively assess the uniformity of Triplet-STDP behavior across different vdW phototransistors, we perform systematic characterization and precise extraction of τ values from both device architectures. As evidenced in Supplementary Fig. 18, the extracted time constants exhibit minimal variation across all measured devices, demonstrating exceptional uniformity and robust reproducibility of the STDP characteristics in our fabricated vdW phototransistor arrays.
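
A minimal sketch of Eq. (1), read as the product of two exponential timing windows, is given below. The function name `triplet_stdp` and the default values of `a3`, `tau1_ms`, and `tau2_ms` are hypothetical placeholders rather than fitted device parameters; the sign of `a3` is assumed to carry the potentiation (L-E-L) or depression (E-L-E) direction.

```python
import math


def triplet_stdp(t_pre, t_post, t_post2, a3=1.0, tau1_ms=50.0, tau2_ms=50.0):
    """Weight change for a spike triplet, following the form of Eq. (1).

    dt1 = t_post - t_pre and dt2 = t_post2 - t_post, as defined in the text.
    a3, tau1_ms and tau2_ms are illustrative values, not measured ones.
    """
    dt1 = t_post - t_pre
    dt2 = t_post2 - t_post
    return a3 * math.exp(-abs(dt1) / tau1_ms) * math.exp(-abs(dt2) / tau2_ms)


# Weight map over the two timing windows (spike times in ms, t_pre = 0).
intervals = (10, 50, 100)
weight_map = {
    (dt1, dt2): triplet_stdp(0, dt1, dt1 + dt2)
    for dt1 in intervals
    for dt2 in intervals
}
```

With equal time constants this form is symmetric in the two windows; a last-pair-dominant behavior such as the one reported for the devices would correspond to τ₂ being shorter than τ₁, which is again an assumption about the fitted values.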

To fully utilize the device characteristics of vdW optoelectronic devices to boost the computational capability, the experimentally observed typical Triplet-STDP behaviors are considered to reshape the Triplet-STDP learning rules. Specifically, a weight fine-tuning factor of G is introduced into the Triplet-STDP rule, which can enhance the generalization ability of the Triplet-STDP rules (GT-STDP) and provide greater weight adjustment flexibility to accommodate more diverse Δw changes in Fig. 3 (Supplementary Note 1). Also, we construct a Generalized Paired-STDP (GP-STDP) rule for a fair comparison in subsequent image processing applications. It is worth mentioning that, as we surveyed, recent works on emerging neuromorphic devices with biological features generally demonstrated paired or Triplet-STDP functions, whereas the high-order STDP-based network architectures are rarely explored (Supplementary Table 5). Thus, the comprehensive understanding and construction of multi-order STDP learning rules based on the vdW phototransistor-based vision sensor in this work underpin the following implementation of SNN for processing complicated visual tasks.

Hardware implementation of spike-timing-dependent plasticity

High-order STDP learning rules with multi-dimensional information mapping characteristics are anticipated to endow SNNs with enhanced feature extraction capability and computational performance⁵⁸. To validate this aspect, we first construct a three-layer fully connected SNN architecture incorporating multi-order STDP learning rules to perform a hybrid classification using a self-designed alphanumeric dataset (Fig. 4a). The built dataset consists of several confusing pairs for typical networks, which provides a better platform to showcase the precise feature extraction capabilities of Triplet-STDP. Initial benchmarking on standard MNIST handwritten digit classification reveals comparable accuracy (>90%) between Paired-STDP and Triplet-STDP networks after 1000 training cycles (Fig. 4b and Supplementary Fig. 19), suggesting the simplicity of MNIST task fails to fully exploit the advanced temporal processing capabilities of Triplet-STDP. However, introducing GT-STDP modulation with <10% tuning factor significantly accelerated convergence, achieving >90% accuracy after 500 iterations and 98.8% after 1000 iterations (Fig. 4c), demonstrating enhanced learning efficiency. The true advantage of Triplet-STDP emerges in complex hybrid classification tasks, where it substantially outperforms Paired-STDP, particularly for challenging character pairs like 1/I and 8/B (Fig. 4d and Supplementary Table 6). This performance gap, deliberately accentuated through controlled network scaling to isolate learning rule effects, underscores the enhanced spatiotemporal feature extraction capacity of Triplet-STDP for demanding pattern recognition applications.

**Fig. 4: High-order STDP learning rules-based spiking neural networks (SNNs) for image classification tasks.**

Transitioning from simulation to hardware implementation, we develop a MoS₂ phototransistor array-based neuromorphic system (Fig. 5a, b), exhibiting device uniformity (device-to-device variation of 4.89% for threshold voltage and 9.76% for On/Off ratio) (Fig. 5c). The hardware architecture integrates a 6×6 phototransistor array with peripheral circuits for performing in-memory computing, system control, and current-mode inference operations (Supplementary Fig. 20). More details can be found in the Methods section. Initially, the trained weights are precisely mapped into each array cell via voltage-bias pulse writing, establishing the target conductance states validated by the conductance distribution profiles (Supplementary Fig. 21). During the inference procedure, the selected hybrid patterns are encoded into six asynchronous voltage spike trains as input signals and processed via parallel analog current summation, signal amplification, and conversion to ensure the effective current output (Fig. 5d). By collecting these currents, image classification is implemented successfully, with the output current range matching the range expected for the input image (Fig. 5e). As a result, the confusing pattern pairs can be well recognized with less than a 10% loss in accuracy compared to simulations (Fig. 5f), confirming the fidelity of hardware-implemented Triplet-STDP for feature extraction within the phototransistor array-based architecture.
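
The parallel analog current summation can be pictured as a matrix-vector product between the mapped conductance states and the input voltages. The sketch below assumes a simple row-input, column-output wiring and a winner-take-all readout; both choices, along with the names `column_currents` and `classify` and all numerical values, are illustrative assumptions and not the circuit described in the Methods.

```python
def column_currents(conductances, input_voltages):
    """Summed output current per column: I_j = sum_i G[i][j] * V[i].

    conductances: rows x columns list of mapped weights (e.g. in uS).
    input_voltages: one value per row (e.g. in V), giving currents in uA.
    """
    n_cols = len(conductances[0])
    return [
        sum(row[j] * v for row, v in zip(conductances, input_voltages))
        for j in range(n_cols)
    ]


def classify(conductances, input_voltages):
    """Hypothetical readout: index of the column with the largest current."""
    currents = column_currents(conductances, input_voltages)
    return max(range(len(currents)), key=currents.__getitem__)


# Toy 6x6 array: strong diagonal weights, weak off-diagonal leakage.
G = [[1.0 if i == j else 0.1 for j in range(6)] for i in range(6)]
V = [0, 0, 1, 0, 0, 0]          # a single active input line at one time step

currents = column_currents(G, V)
winner = classify(G, V)
```

In the reported system the inputs are six asynchronous spike trains, so a fuller model would accumulate these column currents over time steps before the amplification and conversion stages.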

**Fig. 5: Hardware implementation of the hybrid image classification based on vdW phototransistor array.**

Spiking neural network mimicking dynamic vehicle tracking

The event-driven characteristics of SNNs make them promising for developing energy-efficient machine vision systems⁵⁹,⁶⁰,⁶¹. As evidenced by recent advances summarized in Supplementary Table 7, SNN architectures employing diverse learning rules have demonstrated progressive evolution from static image analysis to sophisticated dynamic vision processing, particularly in addressing critical automotive challenges such as real-time object detection and tracking for autonomous driving applications. Vehicle tracking, as one of the most prevalent applications in artificial intelligence-enhanced traffic management systems, imposes a stricter requirement for timely information processing. However, current deep recognition models often struggle with insufficient feature capture, inadequate multi-dimensional processing, and dynamic target detection⁴,⁶². Integrating Triplet-STDP with present SNNs potentially offers a solution by enhancing feature extraction ability in complex visual scenes.

To achieve high-performance vehicle tracking, we construct a YOLO-SNN architecture featuring a Triplet-STDP learning rule enabled by the vdW phototransistor vision sensor devices, together with a self-built vehicle dataset containing numerous vehicle occlusion samples (Fig. 6a). Figure 6b depicts how Triplet-STDP governs feature extraction by manipulating the weight update process. Because Triplet-STDP relies on the interrelationships of multiple events, when two neurons exhibit strong spatiotemporal correlation, the weight update during model training is further promoted and strengthened to ensure highly efficient and task-specific feature extraction, and vice versa. Figure 6c, d visualizes the impact of Triplet-STDP on the weight updating process by mapping the training weights in four typical convolutional kernels of the first layer (3 × 3, 16) before and after the STDP layer. This significant weight variation illustrates the enhanced feature extraction capability of Triplet-STDP. The enhanced vehicle tracking capability of the Triplet-STDP-based SNN is visualized in Fig. 6e, f. Notably, when the vehicles overlap, the network without the Triplet-STDP rule can only track a single target, whereas the Triplet-STDP-equipped network still successfully tracks the overlapped targets even though some key features are occluded. Importantly, the Triplet-STDP-equipped network maintains this precise tracking even after the two cars separate, whereas the network lacking the Triplet-STDP rule assigns the wrong identification number. Such a mismatch causes tracking errors in which the network may lose or misidentify targets, reducing overall tracking accuracy and reliability. As a result, incorporating the Triplet-STDP layer improves detection performance, achieving 90.44% accuracy (Fig. 6g). This represents a 14.12-percentage-point improvement over the baseline YOLO-SNN architecture (76.32%) and a 5.71-percentage-point improvement over the Paired-STDP variant (84.73%). These quantitative results indicate the boosted feature extraction capability that the high-order temporal processing of Triplet-STDP brings to SNNs.
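As a quick check of the arithmetic, the reported gains are absolute differences between accuracies. The corresponding relative gains, which are not stated in the text and are computed here only for orientation, follow from the same three numbers:

```latex
\begin{aligned}
\Delta_{\text{baseline}} &= 90.44\% - 76.32\% = 14.12 \text{ percentage points}, &
\frac{14.12}{76.32} &\approx 18.5\% \text{ (relative)},\\
\Delta_{\text{Paired}} &= 90.44\% - 84.73\% = 5.71 \text{ percentage points}, &
\frac{5.71}{84.73} &\approx 6.7\% \text{ (relative)}.
\end{aligned}
```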

**Fig. 6: Vehicle tracking application based on the Triplet-STDP-enabled SNN.**

Discussion

In this work, we demonstrate the multi-order STDP characteristics of the vdW phototransistor-based vision sensor devices by manipulating the light-electric interaction behaviors, which are desirable for the development of dynamic perception technology. Through control of light-electric interaction effects, we achieve both Paired-STDP and Triplet-STDP behaviors, with the underlying mechanisms revealed through in-depth analysis of dynamic carrier transport properties. The high-order Triplet-STDP learning rule, with its expanded information mapping dimensions, endows the SNNs with advanced feature extraction capabilities. Importantly, such high-order STDP characteristics demonstrate universality and uniformity across the vdW phototransistor array, exhibiting device-to-device variation of less than 10%. This consistency enables the Triplet-STDP-enabled SNN to achieve significant performance enhancements during hardware implementation on a 6 × 6 vdW phototransistor array for challenging hybrid object recognition tasks, with a classification accuracy loss within 10% between training and inference phases. Furthermore, the enhanced spatiotemporal learning capabilities endowed by Triplet-STDP enable high-accuracy vehicle tracking (90.44%), even under complex overlapping conditions. These findings provide a strategic approach for leveraging rich physical device characteristics to implement high-order learning rules, paving the way for high-performance networks and advancing vision perception technologies.

Methods

Device fabrication

The layered InSe flakes are exfoliated from the InSe bulk crystal onto a PDMS film using a mechanical lift-off method and then transferred onto a Si substrate with 300-nm-thick SiO₂ using a precision transfer platform. The passivation layer for the channel region is then formed by directly evaporating a 5-nm-thick indium layer on top of the exfoliated InSe flakes, endowing the InSe devices with a modulated n-type doping behavior (Supplementary Fig. 22). This doping effect can significantly improve channel mobility⁶³, resulting in lower operating voltage and reduced power consumption of our synaptic device. Subsequently, an aligned copper grid shadow mask is placed on the channel material and Au contacts (50-nm thick) are deposited as the source and drain electrodes using a thermal evaporation machine. To fabricate InSe phototransistors with an h-BN layer, after the InSe channel material is transferred, an additional h-BN layer is aligned and transferred on top of the InSe.

To fabricate the MoS₂ phototransistor array, Pd bottom electrodes are first deposited on the substrate using electron-beam evaporation, where photolithography and lift-off define the electrode patterns. A high-k HfO₂ dielectric layer with a thickness of 30 nm is then deposited via atomic layer deposition (ALD). Next, a 2-nm-thick seeding layer is introduced as a charge-trapping layer, followed by the deposition of a second, 8-nm-thick HfO₂ layer, forming a multi-layer stacked gate dielectric structure. A CVD-grown MoS₂ film (from Six-Carbon Technology) is then transferred onto the designated areas and patterned into channels using photolithography and reactive ion etching. Finally, Cr/Au (10/50 nm) source and drain electrodes are defined by a photolithography procedure, deposited via electron-beam evaporation, and structured using a lift-off process.

Structural analysis and electrical characterization

The morphology of the device is characterized by a transmission electron microscope (TEM; JEOL-JEM2100F operated at 200 kV) equipped with an energy-dispersive X-ray spectroscopy (EDS) system. The Raman spectrum is obtained by Raman spectroscopy (LabRam HR-800, Jobin Yvon, λ: 532 nm). Electrical characterization of the vdW phototransistors is performed in a probe station equipped with a semiconductor parameter analyzer (Keysight B1500A). Source Measurement Units (SMUs) and Waveform Generator/Fast Measurement Units (Keysight B1530A) are built into the analyzer to meet experimental testing needs. Optoelectrical properties of the devices are characterized via a self-made, precisely tunable power laser system, which is combined with a Keysight DSOX2024a digital storage oscilloscope to modulate the input light peak. The 650 nm laser wavelength is selected for optimal matching with the InSe absorption spectrum and enhanced photogenerated carrier efficiency in our devices (Supplementary Fig. 23). To prevent unnecessary fluctuations caused by the environment, all the electrical measurements are performed in a vacuum (<10⁻⁵ Torr) under darkness. Low-frequency noise (LFN) is investigated using a Programmable Point-Probe Noise Measuring System (3PNMS) with a noise floor of 10⁻²⁷ A² Hz⁻¹.

To evaluate device uniformity, a systematic characterization methodology is employed. For assessing the uniformity of individual InSe phototransistors, 20 devices are fabricated across multiple batches under strictly controlled process parameters, including material thickness and electrode deposition procedures. The electrical characteristics of each device are recorded on a typical semiconductor parameter analyzer (Keysight B1500A) and statistically analyzed. Similarly, for array-level uniformity, all 36 devices in a 6 × 6 MoS₂ phototransistor array are independently characterized using a row–column addressing method, enabling accurate evaluation of device-to-device variation within the array through a full-coverage measurement strategy.
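The text does not state how the device-to-device variation percentages are computed. A common choice, assumed here, is the coefficient of variation (population standard deviation divided by the mean) over all measured devices. The sketch below shows that calculation; the threshold-voltage values are made-up placeholders, not measured data.

```python
import statistics

def variation_percent(values):
    """Coefficient of variation in percent: 100 * sigma / |mean|.

    Assumption: the population standard deviation is used because every
    device in the array is measured (full-coverage strategy).
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation undefined")
    return 100.0 * statistics.pstdev(values) / abs(mean)

# Hypothetical threshold voltages (V) for a handful of devices.
example_vth = [1.02, 0.98, 1.05, 0.95, 1.00, 1.01]
print(f"device-to-device variation: {variation_percent(example_vth):.2f}%")
```

If the reported figures were instead derived from a sample standard deviation or a max-min spread, `statistics.stdev` or a range-based metric would replace `pstdev` here.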

The 6 × 6 transistor array is electrically characterized and validated for multiply-accumulate (MAC) operations using an NI-PXI 4163 modular instrumentation system integrated with a 24-channel PXI-2532B multiplexer for row/column addressing. Each transistor unit is selectively activated through a row–column addressing protocol by applying a bias voltage to the target row via a PXI-4132 SMU, with unselected rows and columns grounded to minimize crosstalk. For MAC validation, synaptic weights are programmed into conductance states by applying pulsed signals, and analog computing is emulated by inputting voltage vectors to rows while measuring the summed output currents across columns.
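The multiply-accumulate operation described above can be sketched with an idealized linear model in which each cell contributes a current equal to its conductance times the row voltage, and each column sums those currents. This ignores transistor nonlinearity, wire resistance, and crosstalk; the function names and the numeric values are illustrative assumptions, not part of the measurement software.

```python
def crossbar_mac(conductances, row_voltages):
    """Column currents I_j = sum_i G[i][j] * V[i] for an idealized array.

    conductances: list of rows, each a list of cell conductances (S).
    row_voltages: one input voltage per row (V).
    """
    n_rows = len(conductances)
    if n_rows == 0 or len(row_voltages) != n_rows:
        raise ValueError("need one voltage per row")
    n_cols = len(conductances[0])
    return [
        sum(conductances[i][j] * row_voltages[i] for i in range(n_rows))
        for j in range(n_cols)
    ]

def read_single_cell(conductances, row, col, v_read):
    """Row-column addressing: bias only the target row, ground the rest."""
    voltages = [0.0] * len(conductances)
    voltages[row] = v_read
    return crossbar_mac(conductances, voltages)[col]

# Hypothetical 2 x 2 example (conductances in siemens, voltages in volts).
g = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
print(crossbar_mac(g, [0.1, 0.2]))
print(read_single_cell(g, row=1, col=0, v_read=0.5))
```

Grounding the unselected rows makes their terms vanish from the column sum, which is why a single-cell read reduces to one conductance-voltage product in this model.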

Training of spiking neural networks for image classification

We construct a directly trained SNN architecture consisting of three layers for the image classification task. The first layer is the input layer, in which the number of neurons equals the number of pixels in an input image, with each image pixel corresponding to one neuron. This layer converts input pixels into spike sequences through Poisson encoding. The second layer is the processing layer, containing a variable number of excitatory neurons paired with an equal number of inhibitory neurons. Based on pre-simulation results (Supplementary Fig. 24), the network performs optimally when the neuron count is set to 650 and the number of epochs is 3. When the accumulated input drives an excitatory neuron to its voltage threshold, the neuron sends an excitatory signal. Simultaneously, the inhibitory neuron connected one-to-one with that excitatory neuron receives the corresponding signal and provides lateral inhibition, meaning that each excitatory neuron, except the one sending the excitatory signal, is inhibited by its connected inhibitory neuron. This setup fosters competition among the excitatory neurons. The third layer is the output layer, which outputs the classification support obtained after training each image.
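The two mechanisms in this paragraph, Poisson encoding of pixels and competition through lateral inhibition, can be sketched as follows. This is a simplified illustration and not the simulation code used in the study: the per-step Bernoulli approximation of a Poisson process, the `max_rate` parameter, and the subtractive form of the inhibition are all assumptions.

```python
import random

def poisson_encode(pixels, n_steps, max_rate=0.5, seed=0):
    """Turn 8-bit pixel intensities into binary spike trains.

    Each time step, a pixel spikes with probability max_rate * pixel / 255
    (a Bernoulli approximation of a Poisson process; assumed here).
    Returns a list of n_steps frames, each a list of 0/1 per pixel.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < max_rate * p / 255.0 else 0 for p in pixels]
        for _ in range(n_steps)
    ]

def lateral_inhibition(potentials, threshold, inhibition):
    """One competition step among excitatory neurons.

    The neuron with the highest potential fires if it reaches threshold;
    it is reset to zero and all other neurons are suppressed.
    Returns (winner_index_or_None, updated_potentials).
    """
    winner = max(range(len(potentials)), key=potentials.__getitem__)
    if potentials[winner] < threshold:
        return None, list(potentials)
    updated = [
        0.0 if i == winner else max(0.0, p - inhibition)
        for i, p in enumerate(potentials)
    ]
    return winner, updated

spikes = poisson_encode([0, 128, 255], n_steps=5)
print(spikes)
print(lateral_inhibition([0.25, 1.5, 1.0], threshold=1.0, inhibition=0.5))
```

Suppressing every neuron except the one that fired is what pushes different excitatory neurons to specialize on different input patterns.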

The directly trained SNN is trained on the MNIST handwritten digit dataset and a self-defined alphanumeric dataset. The MNIST handwritten digit dataset is a standard benchmark in image recognition that includes 60,000 training samples and 10,000 testing samples, with each 28 × 28-pixel grayscale image representing digits from 0 to 9. To evaluate the robustness of our proposed model, we also incorporate a self-defined alphanumeric dataset containing the digits 0–9 and both uppercase and lowercase letters A–Z. The network model demonstrates strong classification performance on both the widely used MNIST dataset and our self-defined alphanumeric dataset. It should be clarified that recognition accuracy is influenced not only by the learning rule but also by network size and training data. In general, larger network scales tend to diminish the observable differences between learning rules. To better isolate and evaluate the impact of the learning rule on feature extraction, we deliberately select a smaller network configuration that could highlight the contrast between Paired-STDP and Triplet-STDP learning rules, even at the cost of reduced overall accuracy.

Hardware implementation of the image classification

To conduct the hardware implementation of image recognition, a hardware computing system integrating a vdW phototransistor array, peripheral electronics (INA226A ADC, DAC7311 DAC), and a microcontroller unit (MCU, STM32F103RCT6) is developed. The phototransistor array is configured in a matrix layout and mounted onto a printed circuit board (PCB) to enable multi-channel measurements. First, the synaptic weights derived from the trained Triplet-STDP-based SNN are deployed into the vdW phototransistor array, allowing each device to emulate a specific synaptic weight through its programmable conductance state. During inference, input images (28 × 28-pixel grayscale frames) are pre-processed by downsampling them to 6 × 6-pixel resolution via lossy compression to match the 6-channel input configuration of the vdW phototransistor array. The resized images are then encoded into programmable voltage signals corresponding to pixel intensities and applied as array inputs. Processing involves parallel analog current summation, signal amplification, and conversion to ensure an effective output current. By collecting these currents, image classification is implemented successfully, with the measured current ranges matching those expected for each input image.
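The pre-processing step can be illustrated with a short sketch. The text specifies only "lossy compression" from 28 × 28 to 6 × 6 pixels, so the block averaging below is an assumed stand-in for whatever resizing method was actually used, and the linear intensity-to-voltage mapping with a hypothetical `v_max` is likewise an assumption.

```python
def downsample(image, out_size=6):
    """Average-pool a square grayscale image down to out_size x out_size.

    Bin edges are floor(i * n / out_size), so 28 pixels split into six
    bins of width 4 or 5. Block averaging is an assumed method.
    """
    n = len(image)
    edges = [i * n // out_size for i in range(out_size + 1)]
    result = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            block = [
                image[y][x]
                for y in range(edges[r], edges[r + 1])
                for x in range(edges[c], edges[c + 1])
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result

def to_voltages(pixels, v_max=1.0):
    """Map 8-bit intensities linearly onto 0..v_max volts (v_max is hypothetical)."""
    return [[v_max * p / 255.0 for p in row] for row in pixels]

# A synthetic 28 x 28 frame with a bright left half.
frame = [[255 if x < 14 else 0 for x in range(28)] for _ in range(28)]
small = downsample(frame)
print(to_voltages(small)[0])
```

Each of the six rows of the resulting voltage map would then drive one input channel of the array.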

YOLO-spike-neural-network simulation for vehicle tracking

In the vehicle tracking task, we develop a YOLO-SNN architecture composed of an array of convolutional layers, batch normalization layers, activation functions, YOLO layers, feature concatenation layers, and a deconvolution layer. The architecture begins with convolutional layers that perform spatial downsampling for feature extraction. The batch normalization layers following the convolutional layers normalize the features, aiding in accelerated training and model stabilization. The model uses ReLU as the activation function, introducing non-linear characteristics to the model. After sequential convolutions and normalizations, the feature maps are processed through feature concatenation layers and YOLO layers, which produce the target bounding boxes and classification results. To improve the spatial resolution of the feature maps, the network model incorporates a deconvolution layer to reduce the number of channels, followed by further convolutions. To ensure more accurate feature representation, the weights of the convolutional layers are updated using a local hardware-friendly STDP learning rule.

We train the proposed YOLO-SNN architecture using a self-defined vehicle dataset, where each image is initially sized at 224 × 224 pixels and encompasses 75 different vehicle categories. During preprocessing, images are resized to 416 × 416 pixels, followed by image augmentation. The network utilizes an SGD optimizer for parameter updates, with convolution primarily employing 3 × 3 kernels. Batch normalization parameters are set with an epsilon of 0.0001 and momentum at 0.03 to control the precision and speed of normalization updates. The YOLO model detects features at a scale of 75, with deconvolution layers having 128 channels and using 3 × 3 convolution kernels. This setup optimizes the network’s ability to effectively detect and identify a variety of vehicles.
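To show where the two batch normalization parameters enter, here is a minimal single-channel sketch using the stated epsilon (0.0001) and momentum (0.03). It assumes the convention in which momentum weights the new batch statistic (as in PyTorch); some frameworks define momentum the other way round, and the text does not say which one applies. The `gamma` and `beta` defaults and the use of the biased variance for the running estimate are simplifications.

```python
import math

def batch_norm_step(batch, running_mean, running_var,
                    momentum=0.03, eps=1e-4, gamma=1.0, beta=0.0):
    """Normalize one channel of a batch and update running statistics.

    eps keeps the division stable when the batch variance is tiny;
    momentum sets how fast the running estimates follow new batches
    (assumed convention: new = (1 - momentum) * old + momentum * batch).
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    scale = gamma / math.sqrt(var + eps)
    normalized = [scale * (x - mean) + beta for x in batch]
    new_mean = (1.0 - momentum) * running_mean + momentum * mean
    new_var = (1.0 - momentum) * running_var + momentum * var
    return normalized, new_mean, new_var

out, mean_est, var_est = batch_norm_step([1.0, 3.0], running_mean=0.0, running_var=1.0)
print(out, mean_est, var_est)
```

With momentum at 0.03 the running estimates move only 3% of the way toward each new batch, so they average over many batches before being used at inference time.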

Data availability

The source data underlying the figures of this study are available at https://figshare.com/s/52312f1f3dfa324585a7. The data that support the findings of this study are available from the corresponding author upon request. All photos in Fig. 6 were taken by the authors and do not involve any third-party copyright.

Code availability

The codes used for simulation and data plotting are available from the corresponding authors upon request.

References

Yang, H. F., Cai, J., Liu, C., Ke, R. & Wang, Y. Cooperative multi-camera vehicle tracking and traffic surveillance with edge artificial intelligence and representation learning. Transp. Res. Part C. Emerg. Technol. 148, 103982 (2023).
Hsieh, J. W., Yu, S. H., Chen, Y. S. & Hu, W. F. Automatic traffic surveillance system for vehicle tracking and classification. IEEE Trans. Intell. Transp. Syst. 7, 175–187 (2006).
Chen, J. et al. Optoelectronic graded neurons for bioinspired in-sensor motion perception. Nat. Nanotechnol. 18, 882–888 (2023).
Zhang, Z. et al. All-in-one two-dimensional retinomorphic hardware device for motion detection and recognition. Nat. Nanotechnol. 17, 27–32 (2022).
Huang, P. Y. et al. Neuro-inspired optical sensor array for high-accuracy static image recognition and dynamic trace extraction. Nat. Commun. 14, 6736 (2023).
Yang, Z. et al. A vision chip with complementary pathways for open-world sensing. Nature 629, 1027 (2024).
Dang, B. et al. Reconfigurable in-sensor processing based on a multi-phototransistor–one-memristor array. Nat. Electron. 7, 991–1003 (2024).
Wu, G. et al. Ferroelectric-defined reconfigurable homojunctions for in-memory sensing and computing. Nat. Mater. 22, 1499–1506 (2023).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Ghosh-Dastidar, S. & Adeli, H. Spiking neural networks. Int. J. Neural Syst. 19, 295–308 (2009).
Tavanaei, A., Ghodrati, M., Kheradpisheh, S. R., Masquelier, T. & Maida, A. Deep learning in spiking neural networks. Neural Netw. 111, 47–63 (2019).
Kheradpisheh, S. R., Ganjtabesh, M., Thorpe, S. J. & Masquelier, T. STDP-based spiking deep convolutional neural networks for object recognition. Neural Netw. 99, 56–67 (2018).
Zhang, W. et al. Neuro-inspired computing chips. Nat. Electron. 3, 371–382 (2020).
Zhou, Y. et al. Computational event-driven vision sensors for in-sensor spiking neural networks. Nat. Electron. 6, 870–878 (2023).
Cassenaer, S. & Laurent, G. Hebbian STDP in mushroom bodies facilitates the synchronous flow of olfactory information in locusts. Nature 448, 709–713 (2007).
Woźniak, S., Pantazi, A., Bohnstingl, T. & Eleftheriou, E. Deep learning incorporating biologically inspired neural dynamics and in-memory computing. Nat. Mach. Intell. 2, 325–336 (2020).
Marković, D., Mizrahi, A., Querlioz, D. & Grollier, J. Physics for neuromorphic computing. Nat. Rev. Phys. 2, 499–510 (2020).
Leong, J.-F. et al. N-P Reconfigurable dual-mode memtransistors for compact bio-inspired feature extractor with inhibitory-excitatory spiking capability. Adv. Funct. Mater. 33, 2302949 (2023).
John, R.-A. et al. Synergistic gating of electro-iono-photoactive 2D chalcogenide neuristors: coexistence of Hebbian and homeostatic synaptic metaplasticity. Adv. Mater. 30, 1800220 (2018).
Yang, R. et al. Synaptic suppression triplet-STDP learning rule realized in second-order memristors. Adv. Funct. Mater. 28, 1704455 (2018).
Wang, Z. et al. Toward a generalized Bienenstock-Cooper-Munro rule for spatiotemporal learning via triplet-STDP in memristive devices. Nat. Commun. 11, 1510 (2020).
Diehl, P. U. & Cook, M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Front. Comput. Neurosci. 9, 99 (2015).
Babadi, B. & Abbott, L. F. Stability and competition in multi-spike models of spike-timing dependent plasticity. PLoS Comput. Biol. 12, e1004750 (2016).
Zhai, Y. et al. 2D heterostructure for high-order spatiotemporal information processing. Adv. Funct. Mater. 32, 2108440 (2022).
Clopath, C., Büsing, L., Vasilaki, E. & Gerstner, W. Connectivity reflects coding: a model of voltage-based STDP with homeostasis. Nat. Neurosci. 13, 344–352 (2010).
Wang, Z. et al. An electrolyte-gated InGaZnO phototransistor that emulates visual experience-dependent plasticity. Adv. Electron. Mater. 11, 2400612 (2025).
Song, X. Physical education teaching mode assisted by artificial intelligence assistant under the guidance of high-order complex network. Sci. Rep. 14, 4104 (2024).
Song, S., Miller, K. D. & Abbott, L. F. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat. Neurosci. 3, 919–926 (2000).
Morrison, A., Aertsen, A. & Diesmann, M. Spike-timing-dependent plasticity in balanced random networks. Neural Comput. 19, 1437–1467 (2007).
Ni, Z. Hybrid structure of silicon nanocrystals and 2D WSe₂ for broadband optoelectronic synaptic devices. In 2018 IEEE International Electron Devices Meeting (IEDM) 38-5 (IEEE, 2018).
Wang, H. X., Gerkin, R. C., Nauen, D. W. & Bi, G. Q. Coactivation and timing-dependent integration of synaptic potentiation and depression. Nat. Neurosci. 8, 187–193 (2005).
Gerstner, W., Lehmann, M., Liakoni, V., Corneil, D. & Brea, J. Eligibility traces and plasticity on behavioral time scales: experimental support of neohebbian three-factor learning rules. Front. Neural Circuit 12, 53 (2018).
Sjöström, P. J., Turrigiano, G. G. & Nelson, S. B. Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. Neuron 32, 1149–1164 (2001).
Li, Y. et al. Oxide-based electrolyte-gated transistors for spatiotemporal information processing. Adv. Mater. 32, 2003018 (2020).
Wang, D. et al. Spintronic leaky-integrate-fire spiking neurons with self-reset and winner-takes-all for neuromorphic computing. Nat. Commun. 14, 1068 (2023).
Zhou, F. et al. Optoelectronic resistive random access memory for neuromorphic vision sensors. Nat. Nanotechnol. 14, 776–782 (2019).
Liu, X. et al. An optoelectronic synapse based on two-dimensional violet phosphorus heterostructure. Adv. Sci. 10, 2301851 (2023).
Xie, D. et al. Polarization-perceptual anisotropic two-dimensional ReS₂ neuro-transistor with reconfigurable neuromorphic vision. Mater. Horiz. 9, 1448–1459 (2022).
Liu, C. et al. Two-dimensional materials for next-generation computing technologies. Nat. Nanotechnol. 15, 545–557 (2020).
Chen, K. et al. Organic optoelectronic synapse based on photon-modulated electrochemical doping. Nat. Photonics 17, 629–637 (2023).
Pasquale, G. et al. Spin polarization detection via chirality-induced tunnelling currents in indium selenide. Nat. Mater. 24, 212–218 (2025).
Liu, J. et al. Selectively modulated photoresponse in type-I heterojunction for ultrasensitive self-powered photodetectors. Laser Photonics Rev. 16, 2200338 (2022).
Pasquale, G. et al. Electrically tunable giant Nernst effect in two-dimensional van der Waals heterostructures. Nat. Nanotechnol. 19, 941–947 (2024).
Yao, J. et al. Ultra-low power carbon nanotube/porphyrin synaptic arrays for persistent photoconductivity and neuromorphic computing. Nat. Commun. 15, 6147 (2024).
Xie, P. et al. Birdlike broadband neuromorphic visual sensor arrays for fusion imaging. Nat. Commun. 15, 8298 (2024).
Wu, G. et al. Miniaturized spectrometer with intrinsic long-term image memory. Nat. Commun. 15, 676 (2024).
Froemke, R. C. et al. Contribution of individual spikes in burst-induced long-term synaptic modification. J. Neurophysiol. 95, 1620–1629 (2006).
Ghosh, S. et al. A complementary two-dimensional material-based one instruction set computer. Nature 642, 327–335 (2025).
Ao, M. et al. A RISC-V 32-bit microprocessor based on two-dimensional semiconductors. Nature 640, 654–661 (2025).
Yu, H. et al. Eight in. wafer-scale epitaxial monolayer MoS₂. Adv. Mater. 36, 2402855 (2024).
Arora, H. & Erbe, A. Recent progress in contact, mobility, and encapsulation engineering of InSe and GaSe. InfoMat 3, 662–693 (2021).
Ho, P.-H. et al. High-mobility InSe transistors: the role of surface oxides. ACS Nano 11, 7362–7370 (2017).
Xu, H. et al. Critical band-to-band-tunnelling based optoelectronic memory. Light Sci. Appl. 14, 72 (2025).
Su, S. K. et al. Layered semiconducting 2D materials for future transistor applications. Small Struct. 2, 2000103 (2021).
Ren, Y. et al. Synaptic plasticity in self-powered artificial striate cortex for binocular orientation selectivity. Nat. Commun. 13, 5585 (2022).
Gjorgjieva, J., Clopath, C., Audet, J. & Pfister, J. P. A triplet spike-timing-dependent plasticity model generalizes the Bienenstock-Cooper-Munro rule to higher-order spatiotemporal correlations. Proc. Nat. Acad. Sci. USA 108, 19383–19388 (2011).
Montangie, L., Miehl, C. & Gjorgjieva, J. Autonomous emergence of connectivity assemblies via spike triplet interactions. PLOS Comput. Biol. 16, e1007835 (2020).
Hansen, J.-Y. et al. Mapping neurotransmitter systems to the structural and functional organization of the human neocortex. Nat. Neurosci. 25, 1569–1581 (2022).
Su, Q. et al. SNN-BERT: Training-efficient Spiking Neural Networks for energy-efficient BERT. Neural Netw. 180, 106630 (2024).
Li, D. et al. Double-opponent spiking neuron array with orientation selectivity for encoding and spatial-chromatic processing. Sci. Adv. 11, eadt3584 (2025).
Zhu, Z., Hou, J., & Wu, D. O. Cross-modal orthogonal high-rank augmentation for RGB-event transformer-trackers. In Proceedings of the IEEE/CVF International Conference on Computer Vision 22045–22055 (IEEE, 2023).
Talaei Khoei, T., Ould Slimane, H. & Kaabouch, N. Deep learning: systematic review, models, challenges, and research directions. Neural Comput. Appl. 35, 23103–23124 (2023).
Li, M. J. et al. High mobilities in layered InSe transistors with indium encapsulation-induced surface charge doping. Adv. Mater. 30, 1803690 (2018).

Acknowledgements

The authors would like to acknowledge the financial support given by the National Key Research and Development (R&D) Program from Ministry of Science and Technology (2024YFA1211500, M.J. Li) and National Natural Science Foundation of China (62304320, M.J. Li). The authors also acknowledge the support from the Taiwan Ministry of Science and Technology (NSTC 112-2628-M-005-001-MY3, Y.-F. Lin).

Author information

These authors contributed equally: Mengjiao Li, Hongling Chu, Caifang Gao.

Authors and Affiliations

School of Microelectronics, Shanghai University, Jiading, Shanghai, China
Mengjiao Li, Hongling Chu, Caifang Gao, Muyun Huang, Lingling Miu, Jun Li, Jingjing Liu & Jianhua Zhang
Shanghai Collaborative Innovation Center of Intelligent Perception Chip Technology, Shanghai, China
Mengjiao Li
Department of Physics, National Chung Hsing University, Taichung, Taiwan
Feng-Shou Yang & Yen-Fu Lin
Graduate Institute of Applied Science and Technology, National Taiwan University of Science and Technology, Taipei, Taiwan
Ching-Hwa Ho

Contributions

M.L. and J.L. conceived and designed the experiments. M.L., C.G., and F.-S.Y. fabricated the device and conducted electrical measurements. C.-H.H. provided the InSe crystal. M.H. and L.M. performed the simulation of a spiking neural network. M.L. and H.C. completed the data analysis together. The paper is written by M.L. and H.C. with contributions from all the co-authors. Y.-F.L. and J.Z. supervised the research. J.L. provided some experimental methods. All the authors discussed the results and commented on the manuscript.

Corresponding authors

Correspondence to Jingjing Liu, Yen-Fu Lin or Jianhua Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Sunkook Kim and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Li, M., Chu, H., Gao, C. et al. Bioinspired high-order in-sensor spatiotemporal enhancement in van der Waals optoelectronic neuromorphic electronics. Nat Commun 16, 8801 (2025). https://doi.org/10.1038/s41467-025-63873-6

Received: 05 February 2025
Accepted: 01 September 2025
Published: 02 October 2025
DOI: https://doi.org/10.1038/s41467-025-63873-6