Introduction

The advent of big data and the Internet of Things (IoT) has transformed smart devices -- ranging from smartphones to autonomous driving systems (ADS) into high-risk targets for information leakage and cyberattacks1,2. These devices, characterized by their wide distribution and diverse functionalities, handle massive amounts of edge data, making them vulnerable to malicious actors3,4,5. ADS, as a prominent example of IoT terminals, requires real-time processing of data from vehicle sensors and communication with other vehicles, infrastructure, and cloud services6,7. However, the traditional von Neumann architecture, which separates storage and computation, necessitates frequent data transfers between storage and processing units. This separation not only increases processing time but also significantly elevates energy consumption8, particularly in edge computing environments where instantaneous data processing is crucial. Inefficient data handling exposes end devices to prolonged vulnerabilities, heightening the risk of cyberattacks and jeopardizing the overall security and reliability of these systems9. To mitigate these challenges, in-memory computing architecture has been developed to fuse memory and computing functions within a single unit, enabling “in-place processing” of data10,11,12.

Memristors have emerged as promising candidates for in-memory computing systems due to their advantageous properties, including high-speed reading/writing, multi-level storage capability, non-volatility, low power consumption, and low latency13,14,15,16. Unlike the traditional silicon devices in the von Neumann architecture, memristors enable the tight integration of data storage and computation through matrix-vector multiplication17, leveraging the physical principles of Ohm’s law and Kirchhoff’s law to facilitate massive parallel computing capabilities18,19,20. However, the mass integration of memristors often encounters the issue of sneak path currents, where unintended current paths lead to misreading and crosstalk between array units, adversely affecting the accuracy of matrix-vector multiplication operations and, consequently, the training and classification performance of neuromorphic computing systems. To address this challenge, various configurations such as transistor-memristor (1T1M)5, diode-memristor (1D1M)21, and selector-memristor (1S1M)22 have been proposed. While these approaches can mitigate data distortion during read operations, they introduce complexities in circuit design and manufacturing, potentially increasing costs and reducing yield23. In contrast, passive crossbar arrays composed of self-rectifying memristors (SRMs) offer a more elegant solution24,25,26,27,28,29. By eliminating the need for additional switching elements, these arrays can prevent misinterpretation caused by crosstalk, thus enhancing power efficiency and scalability while providing a cost-effective solution for neuromorphic computing applications30,31,32. However, a common issue with SRMs at this stage is that the RR is not large enough or RR, NL, and process ease cannot be accommodated simultaneously, which greatly limits the size of the array24,30. Therefore, exploring SRMs with both superior RR, NL, and the required simple process flow can significantly advance the development of larger scale in-memory computing.

In this study, we design a SRM based on a simple TiN/HfOx/Pt structure and extend its application into arrays for real-time classification of autonomous driving databases. We conduct comprehensive physical and chemical characterizations of the device’s thin films. Following rapid thermal annealing (RTA), our SRMs achieve recorded high rectification ratios (RR) of 108 and nonlinearity (NL) of 105, along with minimal device-to-device (D2D) and cycle-to-cycle (C2C) variations. We elucidate the underlying conducting mechanisms and demonstrate the scalability of SRMs in crossbar arrays, achieving a proof-of-concept scalability exceeding 25.4 terabits. Furthermore, we validate the artificial synaptic properties of the SRMs, enabling the construction of hardware-based artificial neural networks for autonomous driving applications. Notably, we assess the performance of SRMs in training and classifying the Berkeley DeepDrive (BDD) dataset, revealing the attack-resilient accuracy of 32 × 32 crossbar arrays based on the proposed SRMs closely approaches that of the YOLOv9 model. This work highlights the potential of SRMs in enhancing the security of ADS and lays the groundwork for their broader application in edge IoT environments.

Results

Thin film characterization and analysis

The resistance state of a memristor is influenced by the history of electrical stimulation applied to the device, analogous to the function of synapses in the human brain. The resistance change in memristors can simulate synaptic weight changes, thereby mimicking the behavior of biological synapses33. A schematic illustrating the synaptic function of the proposed single-layer SRM is presented in Fig. 1a. In biological systems, an input pulse signal from a presynaptic neuron triggers the release of neurotransmitters into the synaptic gap, which then migrate toward the postsynaptic membrane. Postsynaptic neurons integrate information by recognizing the intensity and timing differences between excitatory and inhibitory postsynaptic potentials from various synapses34,35. This interconnected network of neurons forms a large-scale neural network. Similarly, a complete crossbar array of SRMs functions as an artificial synaptic system, where each SRM unit acts as a single synapse23. When stimulated by voltage pulses along the word lines the SRMs yield a current response corresponding to the changes in their resistance state, which is then output along the bit lines34. The process effectively converts digital signals into analog signals.

Fig. 1: Physical and chemical characterization of thin films.
Fig. 1: Physical and chemical characterization of thin films.
Full size image

a Scheme for artificial synapses based on the SRMs with the structure of TiN/HfOx/Pt. b The TEM image of the cross-section of SRMs with the structure of TiN/HfOx/Pt. c EDS mapping images for Pt, Hf, O, Ti. df The XPS spectra of Hf, O without RTA, and O with RTA under 150 °C for 600 s. gj AFM maps of HfOx surfaces for four manufacturing batches.

The cross-sectional transmission electron microscopy (TEM) image of the proposed SRMs is shown in Fig. 1b, while the energy dispersive spectroscopy (EDS) images of platinum (Pt), hafnium (Hf), oxygen (O), nitrogen (N) and titanium (Ti) are presented in Fig. 1c. The thickness of HfOx layer is maintained at 10 nm to ensure adequate insulation while balancing defect concentration. Using this structure, we fabricated the SRMs (Supplementary Fig. 1), with the size of 3 × 3 μm2. Given that HfOx significantly influences the electrical properties of the proposed memristors, it is crucial to analyze its chemical composition. The X-ray photoelectron spectroscopy (XPS) spectrum of the Hf element, shown in Fig. 1d, reveals two spin-orbit peaks corresponding to Hf 4 f7/2 and Hf 4f5/2, indicative of Hf4+ valence states in the pristine HfOx. And the Hf 4f XPS spectrum of HfOx with RTA is illustrated in Supplementary Fig. 2. The XPS spectrum of oxygen in pristine HfOx (Fig. 1e) shows a coexistence of 38.65% lattice oxygen (OL) with the peak at 531.03 eV and 61.35% oxygen vacancies (OV) with the peak at 532.82 eV, confirming a substantial number of defects within the HfOx films produced via magnetron sputtering. Previous studies indicate that a high concentration of OV can hinder the self-rectification effect in memristors due to the formation of continuous conducting filaments36,37. To mitigate this issue, we performed RTA on the HfOx films at 150 °C for 600 s. The XPS spectrum of the oxygen following RTA (Fig. 1f) shows a significant reduction in OV concentration to 18.11%, where the binding energy peak of OL and OV is 530.85 eV and 532.63 eV respectively, demonstrating effective repair of the vacancies. The variation in OV concentration from the surface to a depth of 5 nm in the HfOx films after the RTA process with the same parameters is illustrated in Supplementary Fig. 3. Additionally, to validate the appropriateness of the selected RTA parameters, Supplementary Fig. 4 presents OV concentration data corresponding to various RTA temperatures and durations. Surface roughness is another critical factor influencing memristor performance38. Figure 1g–j displays atomic force microscopy (AFM) images of HfOx films randomly selected from four different fabrication batches, indicating stable and low root-mean-square (RMS) surface roughness. Smooth surfaces reduce contact resistance and optimize charge injection for more stable and repeatable resistive switching, while rough surfaces are prone to uneven electric fields, localized hot spots, and deformation, which disrupt switching uniformity39. In SRMs, moderate roughness guides the uniform formation of continuous conductive paths consisting of oxygen vacancies within the HfOx, enhancing uniformity, while excessive roughness leads to random dispersion of the paths, which worsens the cycle-to-cycle variations. In terms of storage properties, smooth surfaces provide a stable environment for the conductive pathways within the HfOx, prolonging retention time, whereas bad roughness accelerates film degradation, shortening retention and further weakening reliability. Endurance is also regulated by roughness, with optimal parameters enabling stable filament cycling, while excessive roughness triggers conductive filament destabilization, dramatically reducing the number of reliable switching39. The surface profile data is presented in Supplementary Fig. 5, further confirming that the thin-film preparation process employed in this work ensures high-quality film formation of HfOx.

Electrical characterization and scalability simulation

After thoroughly characterizing the HfOx films from both physical and chemical perspectives, we proceeded to evaluate the electrical properties of the SRMs.The direct current (DC) current-voltage (I-V) curves of the SRMs and their wiring configuration during testing are illustrated in Fig. 2a. In this setup, the Pt bottom electrode (BE) is biased while the TiN top electrode (TE) is grounded. The choice of operating voltage scheme critically influences the performance of the SRMs; we selected the 1/6 V scheme (Supplementary Fig. 7) to optimize RR and NL27. RR is defined as the ratio of the low-resistance state (LRS) current at the positive read voltage to that at 2/3 negative read voltage (IV/I-2/3V), while NL is defined as the ratio of the LRS current at the positive read voltage to that at the read voltage of a partially selected cell (IV/I1/6V) (Supplementary Fig. 7)26. The proposed single-layer SRMs achieve a recorded RR of 108 and NL of 105 at the 1/6 V scheme, with a negative leakage current suppressed below 0.1 pA. Additional trends of RR and NL at varying operating voltages can be found in Supplementary Fig. 8. To assess the consistency among devices within the fabricated 32 × 32 array, we present the DC electrical characteristics of 16 randomly selected SRM units from the array in Supplementary Fig. 10, confirming negligible variations. The significance of the RTA process is highlighted in Supplementary Fig. 13 and 14, which shows the I-V curves of TiN/HfOx/Pt memristors without and with the RTA process at different parameters of RTA, demonstrating these parameters (150 °C, 600 s) are optimal for achieving self-rectification. Critically, the absence of self-rectification in non-RTA devices aligns with their filament-dominated conduction, as evidenced by the low-resistance Ohmic I-V profile. After the SET process, the conductive filament-based memristor is in the LRS. However, the robust conductive filaments lead to a negative RESET process that needs to undergo a scanning current with a very high peak value, which in turn makes the self-rectification effect non-existent and thus poorly scalable40. In contrast, RTA-treated devices exhibit nonlinear, rectifying behavior, which correlates with suppressed filament formation. This transition is further supported by the quantified OV concentration gradient derived from XPS depth profiling. When the RTA temperature or time is insufficient, the OV distribution inside the HfOx is uniform, and the conducting filaments can be easily formed and thus realize the abrupt SET40. Figure 2b illustrates the response of the SRMs following 86 consecutive positive voltage sweeps from 0 to 2 V, revealing a gradual increase in response current corresponding to a continuous increase in conductance. This consistent change in conductance effectively mimics the dynamics of synaptic weights, affirming the device’s capability to replicate potentiation behaviors observed in biological synapses. The RR endurance performance of the SRMs is characterized in Fig. 2c, where repeated programming with alternating positive pulses (2 V/1 ms) and negative pulses(−2 V/1 ms) results in minor fluctuations in response current, indicating excellent stability in conductance and RR over more than 107 programming cycles. In addition, we tested the endurance characteristics of the on/off ratio of this SRM (Supplementary Fig. 15), demonstrating the reliability of the proposed single-layer SRMs for long-term repeated programming. Moreover, we tested the device for data retention and obtained the conductance decay curves shown in Supplementary Fig. 16.

Fig. 2: Electrical properties and scalability of the proposed SRMs.
Fig. 2: Electrical properties and scalability of the proposed SRMs.
Full size image

a I-V curves of TiN/HfOx/Pt SRMs based on 1/6 V scheme with the RR of ~108 and the NL of 105. b Dynamic I-V curves obtained from the proposed SRMs with continuous unidirectional sweeping. c RR endurance properties of the proposed SRMs over 107 programming cycles. The conducting mechanisms of the proposed SRMs when d negative bias and e positive bias is added respectively. f A comparison chart of this work with advanced SRMs compatible with CMOS technology on two important metrics: RR and NL. g Schematic diagram of sneak path currents in the crossbar array without self-rectifying. h The scalability simulation of the proposed SRMs with over 25.4 Tb scale. i The trend chart of array size with DC operating voltage.

We further elucidate the conducting mechanism responsible for the ultra-high RR of the SRMs. The selection of TiN and Pt as TE and BE, respectively, is based on their work functions of ~4.5 eV and 5.6 eV41,42, leading to a significant asymmetry in the energy band structure (Supplementary Fig. 17). After the RTA process, a substantial number of OV in the HfOx layer near the TiN/HfOx interface are repaired, while a similar concentration remains at the Pt/HfOx interface. Under negative bias (Fig. 2d), OV migrates in the direction of the electric field, accumulating near the Pt/HfOx interface and resulting in an unbalanced OV distribution. As negative voltage increases, electrons from the BE can’t undergo Schottky emission on account of the large barrier at Pt/HfOx interface, resulting in negative rectification43. Conversely, when a positive bias is applied (Fig. 2e), OV migrates toward the TiN/HfOx interface, leading to a more uniform distribution and the formation of a continuous electron-conducting pathway. As the positive bias increases, electrons undergo both Schottky and Poole-Frenkel emissions, facilitating significant current response. Under positive bias, the HRS is characterized by sparse, non-uniform oxygen vacancy clusters forming discontinuous percolation paths before SET, while the LRS emerges through the formation of continuous, dense, and homogeneous conductive paths driven by field-enhanced vacancy migration after SET (Supplementary Fig. 18). The simulation results shown in Supplementary Fig. 19 confirm the migration as well as the distribution of OV in HfOx with the electric field as previously described, and the detailed scheme is described in the “Methods” Section. To further investigate the impact of TE materials, we also utilized indium tin oxide and tungsten (W), which have similar work functions to TiN44,45. The resulting I-V curves (Supplementary Fig. 20) indicate the DC properties of the proposed SRMs are not directly influenced by the chemical composition of the TEs. The electron affinity of HfOx and the metal’s work function are the key determinants of the metal/oxide Schottky barrier. Supplementary Fig. 21 demonstrates the electron affinity of the upper surface of the HfOx without RTA as obtained by Low Energy Inverse Photoelectron Spectroscopy (LEIPS) characterization46. The value is 3.542 eV in the absence of an applied electric field, which is closer to the figure of merit of 4.5 eV for TiN. And when positive bias is applied, as mentioned before, the Schottky barrier of TiN/HfOx decreases significantly due to the increased electron affinity of that side of the HfOx, resulting in significant Schottky emission at low voltage with guaranteeing the large enough barrier of Pt/HfOx. Furthermore, in order to verify the important influence of Schottky emission and P-F emission on the conductive mechanism of the proposed SRM, we also tested the DC I-V curves at different high temperatures and fitted them linearly (Supplementary Fig. 22). Further, based on the data at high temperatures, the variation of the Schottky barrier at the TiN/HfOx interface with temperature is illustrated in Supplementary Fig. 22, which is 0.16 eV at low temperatures and 0.56 eV at high temperatures47. Notably, the complementary metal-oxide semiconductor (CMOS)-compatible SRMs demonstrated superior performance (Fig. 2f), achieving optimal RR and NL values with scalability to date24,25,26,27,28,29,30,31,32,48. The single-layer structure of the proposed SRMs is simpler and more straightforward to fabricate than conventional two-layer or multilayer designs, and HfOx is compatible with CMOS technology, enabling potential large-scale integration on silicon-based chips49,50.

Based on the single-layer SRMs, we constructed 32 × 32 crossbar arrays to study their scalability and explore their feasibility as a hardware platform for neuromorphic computation. Figure 2g presents a schematic of the sneak paths in localized 3 × 3 sections of the 32 × 32 crossbar arrays. Ideally, only the selected cell should conduct current, while unselected cells remain inactive. The equivalent circuit diagram of the sneak path in the passive crossbar array is shown in Supplementary Fig. 23. Inter-cell misreading occurs only when current flows through the selected cell. However, both SRMs and memristors without self-rectification exhibit pronounced inter-device crosstalk when scaled to larger array sizes51. We employed a pull-up voltage strategy (detailed in Supplementary Note 1) to calculate the read margin and simulate the maximum scalability achievable with the proposed SRMs51. Increasing the read margin allows for a larger operating voltage window, improving the distinction between the LRS and high resistance state (HRS) of the targeted cells during the read operations. However, the read margin typically decreases as the crossbar array size increases. It is generally accepted that when the read margins drop below 10% in the worst case when all unselected cells are incorrectly activated or read, the maximum realizable array size is reached30. As shown in Fig. 2h, the single-layer SRMs achieve scalability exceeding 25.4 Tb at a read voltage of 1.5 V, representing state-of-the-art performance (Supplementary Table 1). We also examined the factors affecting the scalability of the proposed SRMs at different read voltages. Results in Fig. 2i indicate that read voltage below 1.5 V results in poor RR and NL (as shown in Supplementary Fig. 8), leading to limited scalability. Conversely, voltages larger than 1.5 V yield a poor on/off ratio in the I-V curve, similarly constraining scalability. Thus, the combination of scalability, voltage magnitude, and reliability establishes 1.5 V as the optimal read voltage, ensuring scalability without compromise.

Characterization of artificial synaptic properties

As shown in Fig. 2b, the proposed SRMs exhibit a weight potentiation analogous to that of biological synapses. In this section, we demonstrate weight updating through electrical pulse testing. We first applied a positive writing pulse and a negative RESET pulse to a single SRM within the 32 × 32 array, ensuring both pulses had the same width, amplitude, and adjustable intervals. Figure 3a depicts the applied electrical pulse (blue) alongside the corresponding current response (red). Under the positive writing pulse, the device generates a peak current of about 1.3 μA. In contrast, the negative RESET pulse elicits minimal current output due to the previously described rectification effect. For subsequent synaptic weight update tests, we sampled the response current at the onset of the falling edge of the positive pulse. Paired pulse facilitation (PPF) is a fundamental characteristic of biological synapses, reflecting the enhanced response to successive stimuli52. PPF serves as an indicator of synaptic plasticity, which is crucial for artificial synapses in simulating learning and memory processes. The variation of the PPF index with pulse interval time is shown in Fig. 3b, with the PPF index defined by Eq. 1:

$${PPF\; index}=\frac{{I}_{2}{-I}_{1}}{{I}_{1}}\times 100\%$$
(1)

where \({I}_{1}\) is the initial peak current response and \({I}_{2}\) is the subsequent peak current response. With a fixed pulse width of 0.2 ms, the PPF index exhibits an exponential decline with increasing interval time, stabilizing at 0 after 200 ms. The equation for PPF versus interval time t is presented by Eq. 2, where t1 and t2 are the decaying coefficient (approximately equal to 7.44 and 67.88, respectively) as well as A1 and A2 are both constant.

$${PPF\; index}={A}_{1}\exp \left(-\frac{t}{{t}_{1}}\right)+{A}_{2}\exp \left(-\frac{t}{{t}_{2}}\right)$$
(2)
Fig. 3: Artificial synaptic properties of the proposed SRMs.
Fig. 3: Artificial synaptic properties of the proposed SRMs.
Full size image

a One write pulse as well as one RESET pulse (blue) and the resulting response current (red). b The variation of PPF with the paired voltage pulse interval. Inset: Schematic diagram of the measure of PPF. c LTP and LTD for weight updating of proposed SRMs at different voltage pulse intervals. d LTP (blue) and LTD (red) for weight updating of proposed SRMs at the pulse interval of 20 μs. e Cycle-to-cycle variations of weight updating at the pulse interval of 20 μs. f Device-to-device variations of weight updating in one 32 × 32 crossbar array.

Due to the rectification effects, the statistics of paired-pulse suppression are applicable in this context. Synaptic plasticity in biological systems manifests in two primary forms: long-term potentiation (LTP) and long-term depression (LTD). To further investigate these phenomena, we fixed the interval time at 0.2 ms, and explored the dynamic range and linearity of weight updates across various pulse widths (Fig. 3c). Five pulse widths were compared: 20 μs, 0.1 ms, 0.2 ms, 1.0 ms, and 2.0 ms. The results indicate that longer pulse widths (from 0.1 ms to 2.0 ms) exhibit decreased noise relative to 20 μs pulse width, as inadequate linearity in aritificial synaptic updates can severely impair neural network training accuracy and lead to misclassification. At the same time, according to LTD, longer pulse widths lead to a severe weakening of the self-rectifying effect thus worsening the misreading phenomenon in the crossbar array. Conversely, the 20 μs pulse width keeps the self-rectifying effect with minimal noise and great updating linearity. Notably, unlike conventional OV conducting filament-based LTD34, the LTD observed here is constrained by the rectification effect23.

Figure 3d presents the details of LTP and LTD for the artificial synapses based on the proposed SRMs. The subtle C2C variations in weight updating of memristors can significantly interfere with data processing53, making minimal D2D and C2C variations essential for enhancing the accuracy of neuromorphic computations. Figure 3e displays the LTP and LTD characteristics across the 100th, 200th, 300th, 400th, and 500th cycles, demonstrating exceptionally small C2C variations. The main purpose of the LTD in Fig. 3d, e is to ensure that the rectification effect exists and thus the sneak path currents are still effectively suppressed when scaling to larger arrays23. In order to visualize the symmetric LTP/LTD, we tested 104 consecutive LTP/LTD cycles illustrating the excellent endurance of this artificial synapse (Supplementary Fig. 24). We further characterized the randomness of LTP across all SRM cells in a 32 × 32 crossbar array (Fig. 3f). Supplementary Figs. 25 and 26 provide comprehensive statistics on the distribution of 200 LTP simulation states for 1024 SRMs, where there is no overlap between the different states in the crossbar array. The results indicate that the D2D variations become increasingly significant with a higher number of applied pulses. Statistical analysis of D2D and C2C variations for all SRMs in the array is presented in Supplementary Fig. 27. We conducted statistical analyses across 1024 devices (D2D variations) and 104 consecutive LTP cycles (C2C variations) to validate the reproducibility. The low D2D variations of 3.32% and C2C variations of 1.55% confirm that the interfacial switching mechanism, combined with process control, mitigates the variability commonly observed in filamentary memristors6,38, ensuring the ultra-high weight-updating precision of the proposed SRMs for subsequent neuromorphic applications.

ADS based on SRMs

Based on the super-optimal self-rectifying function along with the excellent artificial synaptic properties of SRMs, we further verify the feasibility of applying SRMs to intelligent recognition in ADS. The driving of the SRMs-based crossbar arrays was realized by a printed circuit board (PCB) as shown in Fig. 4a, where Fig. 4b, c are the local zoomed-in chip carrier and SEM image of the 32 × 32 array on the chip carrier, respectively. The peripheral circuitry driving the SRM-based passive crossbar array is shown in Supplementary Fig. 28. We employed the YOLOv5 (Supplementary Fig. 29) and YOLOv9 (Supplementary Fig. 30) models, which are prominent single-stage target detection algorithms. YOLOv9, the latest version, offers significant improvements in efficiency and accuracy compared to YOLOv5 through programmable gradient information and an efficient layer-aggregation network architecture (Fig. 4d). The network structure consists of three components: The Backbone network for feature extraction, the Neck network for enhancing fused features, and the Head network for classification. To validate the combined hardware and software models for real-time traffic conditions, we trained both models using a comprehensive dataset, including the COCO 2017 Dataset, Mapillary Traffic Sign Dataset, LISA Traffic Sign Dataset, Lyft L5 Dataset, and 80% of the Berkeley DeepDrive Dataset (BDD100K)54. We allocated 10% of the BDD100K dataset for validation and the remaining 10% for testing, unifying labels into five categories: pedestrians, vehicles, signposts, obstacles, and stoplights (Supplementary Fig. 31). The connection weights in the neuromorphic architecture (Fig. 4d) were updated in real time using the SRM-based crossbar array.

Fig. 4: Intelligent recognition system for ADS based on SRMs.
Fig. 4: Intelligent recognition system for ADS based on SRMs.
Full size image

a PCB board integrating SRMs-based 32 × 32 crossbar array and driver circuits. b Chip carrier with 32 × 32 crossbar array encapsulated on the PCB board. c SEM image of the 32 × 32 crossbar array. d Neuromorphic model architecture of YOLOv5 and YOLOv9 for the BDD dataset. The classification accuracy of e stoplight, f signpost, g pedestrian, h obstacle, i vehicle, and j total aspects based on the four models: YOLOv5, YOLOv9, M-YOLOv5, and M-YOLOv9, where the sample size is 100. The error bars represent the maximum and minimum intervals of the obtained batch results. k The precision, recall, and F1 score obtained by the four models YOLOv5, YOLOv9, M-YOLOv5, and M-YOLOv9 on the BDD dataset. l The total count of five labels—stoplight, signpost, pedestrian, obstacle, and vehicle in the BDD dataset.

After training, we obtained classification accuracies for the BDD100K dataset, as shown in Fig. 4e–j. The first two columns of the histograms represent the software-based YOLOv5 and YOLOv9 models, while the last two columns correspond to the memristors-based YOLOv5 (M-YOLOv5) and YOLOv9 (M-YOLOv9). All models achieved real-time classification accuracies exceeding 90% for five labels, closely aligning with the requirements for ADS. Notably, the SRM-based models outperformed their software counterparts in stoplight classification due to minimal variations and 200 instantaneous analog states, enhancing weight-updating accuracy. The crosstalk-free crossbar array also aids in feature extraction and noise reduction53. The classification accuracies for green and red lights are detailed in Supplementary Fig. 32. As shown in Fig. 4j, the accuracies for M-YOLOv5 and M-YOLOv9 are nearly identical to those of the software model, differing by only 0.09% and 0.06%, respectively. The SRM-based algorithm offers distinct advantages over conventional YOLOv5 and YOLOv9 architectures in ADS, primarily through its intrinsic analog computing paradigm and hardware-algorithm co-design. Unlike static YOLO models, which rely on fixed-weight inference post-training55, the SRM’s analog tuning capability enables dynamic feature refinement during real-time operation. This is achieved via sub-precision, context-aware weight adjustments that adaptively recalibrate traffic light detectors (e.g., color and shape recognition) under challenging conditions such as fluctuating illumination or partial occlusion—a scenario where traditional YOLO architectures may exhibit suboptimal robustness due to their rigid parameterization56. Furthermore, the analog memristive network circumvents quantization artifacts inherent in digital implementations, which often degrade nuanced feature representations critical for discriminating fine-grained traffic light states (e.g., distinguishing red and yellow signals under glare or fog)57,58. By preserving analog feature fidelity, the SRM-based system mitigates misclassification risks associated with amplitude-truncated digital activations. Additionally, the spatially constrained plasticity of memristive arrays introduces localized learning mechanisms, where weight updates are influenced predominantly by active input regions rather than global network perturbations. This localized adaptation enhances feature-specific learning efficiency, particularly in class-imbalanced traffic scenarios where rare but safety-critical events (e.g., malfunctioning traffic lights) require prioritized representation without destabilizing the broader model59.

We further evaluated model performance using four metrics: precision, recall, and F1 score, which are critical for assessing efficacy. Precision is of central value in reducing false positives and improving decision quality, recall ensures that critical information is not missed, and F1 score quantifies the optimal trade-off point between precision and recall. Detailed metrics are provided in Supplementary Note 2. As depicted in Fig. 4k, the SRMs-based YOLOv5 and YOLOv9 models exhibit comparable metrics to their software-based counterparts, confirming the compatibility of the proposed SRMs with YOLOv9. Additionally, we quantified the distribution of labels in the BDD100K dataset (Fig. 4l). Vehicles account for over 50% of the dataset, while the remaining four labels collectively represent 45%. This distribution indicates that the model’s accuracy for vehicles significantly influences overall classification accuracy. Besides, we added comprehensive experimental results (Supplementary Fig. 33) to explicitly illustrate the effect of pulse size (1 V, 1.5 V, 2 V, 2.5 V, 3 V) and the corresponding LTP/LTD behaviors on ADS recognition accuracy. The results show that the recognition accuracies for all four non-optimal pulse intensities (1 V, 1.5 V, 2.5 V, 3 V) are significantly lower compared to the 2 V benchmark due to negative factors such as high asymmetry and poor linearity in important weight updating.

Anti-attack characteristics of ADS

The feasibility of the proposed SRMs combined with YOLOv5 and YOLOv9 models has been verified. In the IoT era, addressing hardware design cost and power consumption while ensuring the stability of ADS is critical. The SRM arrays reduce hardware size and perform neuromorphic computation with low power consumption, comparable to GPUs24, enhancing the resilience of ADS against attacks from algorithmic or physical sources. Figure 5a illustrates the flow of using SRM arrays to enhance ADS. The vehicle’s camera transmits real-time road video to the SRM array, which minimizes data transmission delays and accelerates decision-making. The refined process is detailed in Fig. 5b, where the details are demonstrated in the Methods module.

Fig. 5: Anti-attack tests for ADS based on SRMs.
Fig. 5: Anti-attack tests for ADS based on SRMs.
Full size image

a Schematic diagram of the hardware-software cooperative anti-attack ADS based on the proposed single-layer SRMs. b Flowchart of real-time classification and decision making for YOLOv9 models combined with SRM arrays. c Sample images of the original traffic dataset with the six attack patterns. d Classification accuracy of YOLOv9 model combined with SRM array, GPU-YOLOv9 model, and YOLOv9 model combined with NSRM array on the BDD100K dataset after six-pattern attacks, where G-P denotes Gray Patches, RGB-P denotes RGB Patches, and R, G, B refers to red, green, and blue, respectively. e PR curves of YOLOv9 model combined with SRM array, GPU-YOLOv9 model, and YOLOv9 model combined with NSRM array on the BDD100K dataset after RGB-patch attacks. Variation of classification accuracy of SRMs-based YOLOv9 model with different f device-to-device variations (different colors correspond to different device states), and different g cycle-to-cycle variations (different colors correspond to state changes of the same device). h Four paradigm scenarios of decision from ADS. i Confusing matrix of four aspects of decision from ADS based on SRMs.

To simulate attacks on the vehicle’s camera, we applied six types of strains to BDD100K images, including blurry patches and RGB patches (Fig. 5c). Other examples of different attack patterns are presented in Supplementary Figs. 3440. The patch attack concept involves covering random image positions with ellipse-shaped patches and the number of patches, the transparency maximum (αmax), are adjustable. Figure 5d shows classification accuracy for the SRM-based model, GPU-based model, and non-self-rectifying memristors (NSRM) model under these attacks. The YOLOv9 model based on NSRM exhibits significantly reduced accuracy due to poor weight-updating precision. NSRMs (the first figure of Supplementary Fig. 13) exhibit poor robustness under attack conditions due to their reliance on filament-dominated conduction40 and the absence of leakage current suppression, suffering from uncontrolled sneak path currents during adversarial perturbations, which amplify noise propagation across the array. This results in feature ambiguity and quantization error accumulation. Attacking patches exploit the lack of spatial selectivity for NSRM-based ADS, causing unintended activation of adjacent synapses and blurring decision boundaries. And digital-to-analog conversion bottlenecks along with accumulated errors in NSRM arrays exacerbate adversarial sensitivity. In contrast, the SRM-based model maintains accuracy comparable to the GPU-based model across all attack types, with RGB patches yielding optimal results. The effects of varying transparency and patch numbers on recognition accuracy are detailed in Supplementary Figs. 41 and 42.

To further evaluate model performance, we analyzed precision and recall using the precision-recall (PR) curve, which illustrates the trade-off between these metrics at different decision thresholds. A well-performing model should have a PR curve that is distant from the diagonal and close to the upper right corner. As shown in Fig. 5e, the SRM-based model’s PR curve closely matches that of the GPU-based model under RGB patch attacks, while the NSRM implementation shows poor performance. We also simulated the impact of D2D and C2C variations on the recognition accuracy. Figures 5f, g indicate that the SRM arrays can tolerate D2D variations up to 10% without significant accuracy lost, while tolerance for C2C variations is limited to about 2%. The ultra-low variations in the proposed SRM arrays are effectively managed through self-correction during weight updates, ensuring robust recognition accuracy. Finally, Fig. 5h presents a schematic diagram of the four decisions made in response to an RGB-patch attack, while the confusion matrix in Fig. 5i illustrates the strong real-time identification and decision-making capabilities of the SRMs-based YOLOv9 model. Moreover, since the proposed SRMs are quasi-volatile, and also the demand for weight retention capability of each cell in the array is high in the hardware-based YOLO model inference process60, we comprehensively calculate the approximate energy consumption required for refreshing and reloading the weights in Supplementary Note 3.

Discussion

In this work, we successfully designed and fabricated a SRM with an ultra-high RR and a simple structure, demonstrating its potential as an attack-resilient technology for autonomous driving. The TiN/HfOx/Pt-based SRM achieves over 108 RR and 105 nonlinearity (NL) after RTA. Experimental results show that the SRM-based ADS achieves classification accuracy comparable to software models, thanks to minimal variations and 200 artificial synaptic states that optimize weight-updating accuracy. Additionally, the SRM maintains an attack-resilient accuracy similar to GPU models across various attack scenarios, underscoring its resilience against algorithmic and physical threats. This SRM technology is poised for widespread application in autonomous driving, promising to advance intelligent transportation systems through its high speed, low power consumption, and enhanced security.

Methods

Preparation of thin films

In this work, all films were prepared using a magnetron sputtering system (DISCOVERY-635, Denton Vacuum) with meticulously controlled parameters to ensure reproducibility. Initially, a 50 nm Pt bottom electrode was sputtered onto SiO2/Si substrates pre-patterned via photolithography (AZ5214 photoresist, 1.2 μm thickness). The Pt deposition was performed at a DC power of 300 W with an Ar gas flow of 37 sccm, a base pressure of 5.0 × 10−6 Torr, and a working pressure of 3.0 mTorr. The substrate temperature was maintained at 25 °C (unheated) with active water cooling to minimize thermal stress, and the deposition rate was calibrated to 0.71 nm/s (70 s duration). Following Pt deposition, the photoresist was removed using acetone and isopropanol ultrasonic baths, followed by N2 drying. Next, a 10 nm HfOx switching layer was reactively sputtered from a HfO2 target (99.995% purity) in the pure Ar atmosphere with the gas flow of 37 sccm. The substrate temperature was actively controlled at 50 °C ± 5 °C to optimize film uniformity and OV density, and the RF power was set to 50 W with a deposition rate of 0.013 nm/s (770 s duration). Post-deposition, the HfOx films underwent RTA in an RTP-VT100M system (ULVAC) under O2 ambient (99.999% purity). The annealing profile included a ramp rate of 25 °C/s, a peak temperature of 150 °C, a dwell time of 600 s, and natural cooling to <50 °C, with the ambient pressure maintained at 10 mTorr to suppress oxygen exchange and stabilize the OV gradient. Subsequently, a 50 nm TiN top electrode was sputtered using a TiN target (99.995% purity) with an Ar gas flow of 37 sccm. The substrate temperature was set to 25 °C (unheated), and the DC power was 200 W with a deposition rate of 0.21 nm/s (240 s duration). Finally, all remaining photoresist was removed via sequential acetone (10 min) and isopropanol (5 min) ultrasonic baths, followed by N2 drying.

Characterization of thin films

High-resolution TEM images of cross-sections were obtained using a Thermo Scientific Helios G4 HX, with samples prepared via the in-situ focused ion beam (FIB) lift-out technique. To protect the sample surface from ion-beam damage during extraction and thinning, electron-beam-deposited platinum (e-PT) and ion-beam deposited platinum (i-Pt) were applied. EDS was conducted using a Super X FEI System to analyze all elemental compositions. Additionally, XPS for the elements of Hf and oxygen O was performed with a Thermo Scientific ESCALAB 250Xi model, with an Al Ka (1486.6 eV) XPS source. AFM characterization was carried out using a Dimension ICON instrument. The AFM images shown in Fig. 1g-j were all obtained from four batches of prepared HfOx films subjected to RTA at 150 °C for 600 s, after the sputtering processes utilizing a power of 50 W for 400 s at a chamber pressure of 3.72 mTorr. UPS and LEIPS were tested with an E03-001 Scanning Focusing Multi-Platform XPS system, model PHI 5000 VersaProve, with an XPS energy resolution of 0.1 eV and a filtering energy of 4.38 eV. The parameters of UPS and calibration parameters for the LEIPS pre-test included an X-ray source with a He1 21.22 eV at 80 W and a beam spot diameter of 5 μm.

Defect migration simulation

The finite element analysis simulations are performed with the COMSOL Multiphysics software. The FEA model in shown below was defined by the law of mass conservation and electroneutrality assumption of related ions without considering the possible side reactions. A 2D model with dimensions of 7 × 5 μm was created in the simulation model, and the effects of ion concentration diffusion and ion migration on ion concentration were considered. The simulation model considers three different models: (1) No potential influence; (2) Electric field distribution from bottom to top; (3) Electric field distribution from top to bottom. By setting up these three models, the influence of the electric field direction on ion migration and diffusion distribution is studied. The diffusion coefficient of each domain is determined based on the relationship between the ionic conductivity and the diffusion coefficient in the Nernst-Einstein equation. Therefore, under the combined action of electric field migration and concentration diffusion, the anions and cations move in different directions, eventually reaching a steady state.

Using the current distribution and concentration diffusion equation to track the current and concentration distribution, the flux of each ion in the electrolyte can be calculated by Nernst-Planck equation:

$${N}_{i}=-{D}_{i}\left(\nabla {c}_{i}+\frac{F}{{RT}}{z}_{i}{c}_{i}\nabla {{{\varnothing }}}_{l}\right),i=1,2,\,\ldots,n$$
(3)

Where Di, Ni, ci, zi, F, T, R, and ϕl denote the diffusion coefficient, the flux, the concentration, the charge number, the Faradaic constant, temperature, gas constant, and electrolyte potential, respectively. l stands for the position along the diffusion region of thickness d (0 <l < d). Based on the steady-state continuity equations and the law of mass conservation (Eq. 4), the electroneutrality assumption was represented as Eq. 5:

$$\frac{\partial {c}_{i}}{\partial t}+\nabla \cdot {N}_{i}=0$$
(4)
$$F{\sum }_{i=1}^{n}{z}_{i}{c}_{i}=0$$
(5)

The boundary condition was set as Eq. 6. The simulation steps include a current distribution initialization and a steady step. Note that the initial defect concentration distribution of the three models is set to be the same.

$$\vec{n}\cdot J=0$$
(6)

The circuitry design of driving system

An 8-layer PCB with a 1.6 mm thick FR4 substrate was custom-designed to accommodate a 32 × 32 SRM-based passive crossbar array and its driving circuitry. The pulse generation and control module, powered by an STM32H743VIT6 microcontroller unit (MCU) with an ARM Cortex-M7 core operating at 480 MHz, managed timing signals and row/column addressing. Four AD5764R digital-to-analog converters (DACs) with 16-bit resolution and ±10 V output generated programming pulses (±2 V, 20 μs) and read pulses (0.5 V, 20 μs). These pulses were buffered and amplified by OPA548 operational amplifiers (Op-Amps) to drive the SRM array’s rows and columns. For row/column selection, 64-channel ADG732 analog switches with ±15 V compliance were used, reducing interconnect complexity. LT3045 low-noise voltage regulators provided stable ±2 V and 0.5 V power supplies, while series resistors (1 kΩ) and transient voltage suppressors (TVS diodes) protected against overcurrent. Precision current-sense amplifiers (TI INA240) with a gain of 100 V/A measured sub-nanoampere currents from the SRMs, which were then digitized by ADS1256 analog-to-digital converters with 24-bit resolution and a 30 kSPS sampling rate, achieving a conductance resolution of 0.1 pS. Data communication with an external computer was facilitated by an FTDI FT601 USB 3.0 controller, supported by a Raspberry Pi Compute Module 4 for GPIO expansion. The PCB layout included guard rings and ground planes to isolate analog and digital sections, minimizing crosstalk, and featured copper pours and heatsinks for thermal management of high-power components.

ADS configurations

By combining the core architecture of the YOLO model with an FPGA-driven 32 × 32 SRM array, the acceleration hardware achieves real-time updating of the four layers of weights in the YOLO model, including the Data Preprocessing Layer, the Feature Extraction Layer, the Feature Fusion Layer consisting of Backbone and Neck modules, and the Head Prediction Layer. First, the data preprocessing layer extracts the local features of the image through a convolution operation (Eq. 7), followed by enhancing the feature representation using the Channel Attention Mechanism (CBAM), which weights the features through the Channel Attention Module and the Spatial Attention Module to enhance the model’s attention to the important features. Finally, the spatial dimensionality of the feature map is reduced by a maximum pooling operation.

$${{{\bf{X}}}}_{{out}}={{{\bf{W}}}}_{{conv}}{{{\bf{X}}}}_{{in}}+{{\bf{b}}}$$
(7)

where \({{{\bf{W}}}}_{{conv}}\) refers to the convolution kernel weights, \({{{\bf{X}}}}_{{in}}\) refers to the input feature map, and \({{\bf{b}}}\) refers to the bias term. The feature extraction layer extracts high-level features through multiple GhostConv modules.The GhostConv modules generate the feature maps through a combination of main convolution and cheap convolution, which significantly reduces the amount of computation (Eq. 8), where each GhostConv module is followed by a CBAM module to further enhance the feature representation.

$${{\bf{X}}}_{out}=\left[{{\bf{Conv}}}_{primary}\left({{\bf{X}}}_{in}\right){;}{{\bf{Conv}}}_{cheap}\left({{\bf{Conv}}}_{primary}\left({{\bf{X}}}_{in}\right)\right)\right]$$
(8)

where \({{{\bf{Conv}}}}_{{primary}}\) represents the main convolutional operation and \({{{\bf{Conv}}}}_{{cheap}}\) is the cheap convolutional operation, both of which are spliced through the channel. The feature fusion layer achieves multi-scale feature fusion through a bidirectional feature pyramid network (BiFPN). The BiFPN fuses features of different scales through top-down (Eq. 9) and bottom-up (Eq. 10) paths to enhance the YOLO model’s detection capability for multi-scale targets.

$${{{\bf{P}}}}_{i}^{{td}}={{\bf{Conv}}}({{{\bf{P}}}}_{i}+{{\bf{UpSample}}}({{{\bf{P}}}}_{i+1}^{{td}}))$$
(9)
$${{{\bf{P}}}}_{i}^{{bu}}={{\bf{Conv}}}{(}{{{\bf{P}}}}_{i}^{{td}}{{\boldsymbol{+}}}{{\bf{MaxPool}}}{(}{{{\bf{P}}}}_{i-1}^{{bu}}{)}{)}$$
(10)

Finally, the dynamic detection head combines scale attention and spatial attention to weight the features for category prediction (Eq. 11) and bounding box regression (Eq. 12), respectively.

$${{\bf{ClassPred}}}={{\bf{Conv}}}({{\bf{X}}}\odot {{\bf{ScaleAtt}}}({{\bf{X}}}))$$
(11)
$${{\bf{RegPred}}}={{\bf{Conv}}}({{\bf{X}}}\odot {{\bf{SpatialAtt}}}({{\bf{X}}}))$$
(12)

where \({{\boldsymbol{\odot }}}\) denotes element-by-element multiplication.

By controlling the analog computational power of the memristor array, the FPGA achieves efficient weight programming and updating. Specifically, the FPGA communicates with the host computer through the interface, transfers the weight matrix to the SRM array in chunks using a control packet, and realizes weight quantization and programming through the DAC configuration.The FPGA divides the large weight matrix into 32 × 32 sub-chunks through the chunk programming strategy, and transfers and programs them into the memristor array block by block. The programming process for each sub-block consists of the following steps: first, the FPGA sends a control packet to start the programming sequence, followed by transmitting the weight data line by line, and finally sends an end packet to complete the programming. During the training phase, the model gradually updates the weights in the FPGA through a gradient accumulation strategy to ensure hardware stability. Specifically, the model calculates the loss function after each forward propagation, and calculates the gradient W through back propagation with the gradient accumulation formula (Eq. 13):

$$\nabla {W}_{{accumulated}}={\sum }_{i=1}^{m}\nabla {W}_{i}$$
(13)

where m is the number of gradient accumulation steps. The accumulated gradients are updated with weights through the optimizer and the updated weights are programmed into the SRM array through the FPGA interface. In the inference stage, the FPGA accelerates the forward propagation process of YOLO model through the analog computing power of the SRM array.

Subsequently, the model outputs detection results including class ID and bounding box, and maps the detection results into discrete state space through state representation function S (Eq. 14). The decision module discretizes the continuous values of the target, such as position and velocity, into a finite number of intervals, thus reducing the complexity of the state space. The discretized data are combined into a multidimensional state vector for describing the overall state of the current environment. This provides clear state inputs to the reinforcement learning algorithm and helps the system to select optimal actions (e.g., braking, steering, accelerating, etc.). Finally, the decision-making module adopts a reinforcement learning framework for real-time decision-making based on the Q-learning algorithm. the Q-value update formula could be despicted as Eq. 15.

$$S=({class\; ID},{distance},{speed})$$
(14)
$$Q\left(s,a\right)\leftarrow \left(1-\alpha \right)Q\left(s,a\right)+\alpha (r+\gamma \max Q({s}^{{\prime} },a^{\prime} ))$$
(15)

where α is the learning rate, γ is the discount factor, and r is the immediate reward. The system selects the optimal policy based on the current state s and the decision action a (brake, decelerate, turn left, turn right, continue driving), and predicts the next state s’ by the state transfer model. The state transfer model is estimated by counting the number of historical transfers (Eq. 16), where \({C}(s,a,{s}^{{\prime} })\) is the number of transfering to state s’ after performing action a from state s.

$$P\left({s}^{{\prime} },|,s,a\right)=\frac{C(s,a,{s}^{{\prime} })}{{\sum}_{{s}^{{\prime} {\prime} }}C(s,a,{s}^{{\prime} {\prime} })}$$
(16)