Abstract
Artificial intelligence edge devices should offer high inference accuracy and rapid response times, as well as being energy efficient. Ensuring the security of these devices against malicious attacks and illegal access requires data protection mechanisms and secure access control. Here we report a spintronic non-volatile compute-in-memory macro for efficient dot-product edge computing with secure access control for activation, key and data protection against power-on and power-off probing. The approach relies on spintronic-based physically unclonable functions and two-dimensional half-complement physical encryption, as well as a snoop-proof self-decryption burst-read scheme in conjunction with a sparsity-and-rectified-linear-unit-aware early-termination compute-in-memory engine. The 6.6 megabit complementary metal–oxide–semiconductor (CMOS)-integrated macro uses 22 nm spin-transfer torque magnetic random-access memory technology. The macro achieves high randomness (inter-Hamming distance, 0.4999) and high reliability for physically unclonable functionality (intra-Hamming distance, 0), as well as a high energy efficiency for dot-product computation (between 30.1 and 68.0 tera-operations per second per watt).
This is a preview of subscription content, access via your institution
Access options




Similar content being viewed by others
Data availability
The data that support the plots presented in this paper as well as other findings derived from this study are available from the corresponding author upon reasonable request.
Code availability
The code that supports the experiment platforms is available from the corresponding author upon reasonable request.
References
Chou, C.-C. et al. A 22nm 96KX144 RRAM macro with a self-tracking reference and a low ripple charge pump to achieve a configurable read window and a wide operating voltage range. In 2020 IEEE Symposium on VLSI Circuits 1–2 (IEEE, 2020).
Chih, Y. et al. A 22nm 32Mb embedded STT-MRAM with 10ns read speed, 1M cycle write endurance, 10 years retention at 150 °C and high immunity to magnetic field interference. In 2020 IEEE International Solid-State Circuits Conference—(ISSCC) 222–224 (IEEE, 2020).
Chang, T. et al. A 22nm 1Mb 1024b-read and near-memory-computing dual-mode STT-MRAM macro with 42.6GB/s read bandwidth for security-aware mobile devices. In 2020 IEEE International Solid-State Circuits Conference—(ISSCC) 224–226 (IEEE, 2020).
Chiu, Y.-C. et al. A 40 nm 2 Mb ReRAM macro with 85% reduction in FORMING time and 99% reduction in page-write time using auto-FORMING and auto-write schemes. In Symposium on VLSI Technology T232–T233 (IEEE, 2019).
Jain, P. et al. A 3.6 Mb 10.1 Mb/mm2 embedded non-volatile ReRAM macro in 22 nm FinFET technology with adaptive forming/set/reset schemes yielding down to 0.5 V with sensing time of 5 ns at 0.7 V. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 212–214 (IEEE, 2019).
Dong, Q. et al. A 1 Mb 28 nm STT-MRAM with 2.8 ns read access time at 1.2 V VDD using single-cap offset-cancelled sense amplifier and in-situ self-write-termination. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 480–482 (IEEE, 2018).
Yang, T. et al. A 28 nm 32 Kb embedded 2T2MTJ STT-MRAM macro with 1.3 ns read-access time for fast and reliable read applications. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 482–484 (IEEE, 2018).
Liu, P. et al. A 65 nm ReRAM-enabled nonvolatile processor with 6× reduction in restore time and 4× higher clock frequency using adaptive data retention and self-write-termination nonvolatile logic. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 84–86 (IEEE, 2016).
Lee, C.-F. et al. A 1.4 Mb 40-nm embedded ReRAM macro with 0.07 µm2 bit cell, 2.7 mA/100 MHz low-power read and hybrid write verify for high endurance application. In IEEE Asian Solid-State Circuits Conference (A-SSCC) 9–12 (IEEE, 2017).
Chang, M.-F. et al. Embedded 1 Mb ReRAM in 28 nm CMOS with 0.27-to1 V read using swing-sample-and-couple sense amplifier and self-boost-write-termination scheme. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 332–334 (IEEE, 2014).
Xue, C.-X. et al. A 22 nm 4 Mb 8b-precision ReRAM computing-in-memory macro with 11.91 to 195.7TOPS/W for tiny AI edge devices. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 245–247 (IEEE, 2021).
Xue, C.-X. et al. A CMOS-integrated compute-in-memory macro based on resistive random-access memory for AI edge devices. Nat. Electron. 4, 81–90 (2021).
Xue, C.-X. et al. A 22 nm 2 Mb ReRAM compute-in-memory macro with 121-28TOPS/W for multibit MAC computing for tiny AI edge devices. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 244–245 (IEEE, 2020).
Xue, C.-X. et al. A 1 Mb multibit ReRAM computing-in-memory macro with 14.6 ns parallel MAC computing time for CNN-based AI edge processors. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 388–390 (IEEE, 2019).
Chen, W.-H. et al. CMOS-integrated memristive non-volatile computing-in-memory for AI edge processors.Nat. Electron. 2, 420–428 (2019).
Tang, K.-T. et al. Considerations of integrating computing-in-memory and processing-in-sensor into convolutional neural network accelerators for low-power edge devices. In IEEE Symposium on VLSI Technology 166–167 (IEEE, 2019).
Chen, W.-H. et al. A 65 nm 1 Mb nonvolatile computing-in-memory ReRAM macro with sub-16 ns multiply-and-accumulate for binary DNN AI edge processors. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 494–496 (IEEE, 2018).
Yoon, J.-H. et al. A 40 nm 64 Kb 56.67TOPS/W read-disturb-tolerant compute-in-memory/digital RRAM macro with active-feedback-based read and in-situ write verification. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 404–406 (IEEE, 2021).
Khaddam-Aljameh, R. et al. HERMES core—a 14 nm CMOS and PCM-based in-memory compute core using an array of 300 ps/LSB linearized CCO-based ADCs and local digital processing. In IEEE Symposium on VLSI Circuits 1–2 (IEEE, 2021).
Wan, W. et al. A 74 TMACS/W CMOS-RRAM neurosynaptic core with dynamically reconfigurable dataflow and in-situ transposable weights for probabilistic graphical models. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 498–499 (IEEE, 2020).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Liu, Q. et al. A fully integrated analog ReRAM based 78.4TOPS/W compute-in-memory chip with fully parallel MAC computing. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 500–501 (IEEE, 2020).
Cai, F. et al. A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations. Nat. Electron. 2, 290–299 (2019).
Mochida, R. et al. A 4M synapses integrated analog ReRAM based 66.5 TOPS/W neural-network processor with cell current controlled writing and flexible network architecture. In IEEE Symposium on VLSI Technology 175–176 (IEEE, 2018).
Ielmini, D. et al. In-memory computing with resistive switching devices. Nat. Electron. 1, 333–343 (2018).
Li, C. et al. Analogue signal and image processing with large memristor crossbars. Nat. Electron. 1, 52–59 (2018).
Boybat, I. et al. Neuromorphic computing with multi-memristive synapses. Nat. Commun. 9, 2514 (2018).
Wang, Z. et al. Fully memristive neural networks for pattern classification with unsupervised learning. Nat. Electron. 1, 137–145 (2018).
Le Gallo, M. et al. Mixed-precision in-memory computing. Nat. Electron. 1, 246–253 (2018).
Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).
Zidan, M.-A. et al. The future of electronics based on memristive systems. Nat. Electron. 1, 22–29 (2018).
Pang, Y. et al. A reconfigurable RRAM physically unclonable function utilizing post-process randomness source with <6 × 10−6 native bit error rate. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 402–404 (IEEE, 2019).
Xue, X. et al. A 28 nm 512 Kb adjacent 2T2R RRAM PUF with interleaved cell mirroring and self-adaptive splitting for extremely low bit error rate of cryptographic key. In IEEE Asian Solid-State Circuits Conference (A-SSCC) 29–32 (IEEE, 2019).
Wu, M.-Y. et al. A PUF scheme using competing oxide rupture with bit error rate approaching zero. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 130–132 (IEEE, 2018).
Gallagher, W.-J. et al. 22 nm STT-MRAM for reflow and automotive uses with high yield, reliability, and magnetic immunity and with performance and shielding options. In IEEE International Electron Devices Meeting (IEDM) 2.7.1–2.7.4 (IEEE, 2019).
Chang, M. & Shen, S. A process variation tolerant embedded split-gate flash memory using pre-stable current sensing scheme. IEEE J. Solid-State Circuits 44, 987–994 (2009).
Nair, V. & Hinton, G.-E. Rectified linear units improve restricted Boltzmann machines. In Proc. 27th International Conference on Machine Learning (ICML'10) 807–814 (IEEE, 2010).
Chen, Z. et al. A 65 nm 3T dynamic analog RAM-based computing-in-memory macro and CNN accelerator with retention enhancement, adaptive analog sparsity and 44TOPS/W system energy efficiency. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 240–242 (IEEE, 2021).
Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. Tech Report (2009).
He, K. et al. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).
Yuval, N. et al. Reading digits in natural images with unsupervised feature learning. In Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011).
Menze, B.-H. et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 34, 1993–2024 (2015).
Federal Information Processing Standards (FIPS) Publication 180-2. Secure Hash Standard (SHS). US DoC/NIST (2002).
Chiu, Y.-C. et al. A 22 nm 4 Mb STT-MRAM data-encrypted near-memory computation macro with a 192 GB/s read-and-decryption bandwidth and 25.1-55.1TOPS/W 8b MAC for AI operations. In IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers 178–180 (IEEE, 2022).
Jung, S. et al. A crossbar array of magnetoresistive memory devices for in-memory computing. Nature 601, 211–216 (2022).
Swami, S., Rakshit, J. & Mohanram, K. SECRET: smartly encrypted energy efficient non-volatile memories. In Proc. 53rd Annual Design Automation Conference 166 (ACM, 2016).
Huang, S., Jiang, H., Peng, X, Li, W. & Yu, S. XOR-CIM: compute-in-memory SRAM architecture with embedded XOR encryption. In 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD) 77 (ACM, 2020).
Li, W., Huang, S., Sun, X., Jiang H. & Yu, S. Secure-RRAM: a 40 nm 16 kb compute-in-memory macro with reconfigurability, sparsity control, and embedded security. In 2021 IEEE Custom Integrated Circuits Conference (CICC) 1–2 (IEEE, 2021).
Chiu, Y.-C. et al. A 22-nm 1-Mb 1024-b read data-protected STT-MRAM macro with near-memory shift-and-rotate functionality and 42.6-GB/s read bandwidth for security-aware mobile device. IEEE J. Solid-State Circuits 57, 1936–1949 (2022).
Acknowledgements
We express our appreciation for support from NTHU, TSMC and NSTC, Taiwan.
Author information
Authors and Affiliations
Contributions
Y.-C.C. designed the circuits for the nvCIM macro and test chip. Y.-C.C., W.-S.K., C.-S.Y., S.-H.T., H.-Y.H., F.-C.C., Y.W., Y.-A.C., F.-L.H., C.-Y.L., G.-Y.L., P.-J.C., T.-H.P., C.-C.L., R.-S.L., C.-C.H., K.-T.T., M.-S.H. and M.-F.C. contributed ideas. C.-S.Y., S.-H.T., H.-Y.H., F.-C.C., Y.W., Y.-A.C., F.-L.H., C.-Y.L., G.-Y.L., P.-J.C. and T.-H.P. built the test measurement system and testing flow for the STT-MRAM nvCIM macro. F.-C.C., Y.W., G.-Y.L. and C.-Y.L. conducted the security experiment and AI demonstration system. Y.-C.C. performed the analysis and measurements of the nvCIM macro. C.-P.L., Y.-D.C., T.-Y.J.C. and M.-F.C. managed the project. Y.-C.C., M.-S.H. and M.-F.C. wrote the paper.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Electronics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Experiment platform used to assess AI inference operations of the nvCIM macro with Resnet20 model.
a, Experiment platform including nvCIM testchip, FPGA board as system controller with intermediate data processor, and a PC to display classification results on an LCD screen. This nvCIM test-chip eliminated the drop in inference accuracy typical of software-based inference, regardless of the dataset SVHN, CIFAR-10, BraTS, and CIFAR-100. b, Flow chart showing inference process implemented on experiment platform. The nvCIM macro performed full-channel dot-product operations (convolutions), while the FPGA fed the input to the nvCIM macro, collected the full-channel dot-product values (Dot-PVFC) from the nvCIM macro, performed pooling operations, and executed the 1st-layer convolution operation. The PC platform presented intermediate data generated during inference and the final inference results on an LCD screen.
Extended Data Fig. 2 Experiment involving chip-ID identification using the nvCIM macro.
a, This experiment was implemented using an nvCIM test-chip, an FPGA as a system controller, and a PC to display chip-ID comparison results. b, Flow chart of chip-ID identification operation. The 6.6 Mb nvCIM macro created a chip-ID, while the FPGA fed the challenge to the nvCIM macro and collected the chip-ID. The nvCIM macro verified the chip-ID, while the FPGA fed the challenge and external chip-ID (chip-IDE) and stored the chip-ID comparison results. The PC was used as a command interface presenting the comparison results and chip-IDs on the LCD screen.
Extended Data Fig. 3 Experiment on accessing secured data using the nvCIM macro.
a, Fabricated 22 nm 6.6 Mb nvCIM test-chip used with an FPGA as system controller and PC to display the comparison results of accessing the secured data on an LCD screen. b, Flow chart showing the process by which secured data was accessed, wherein the FPGA fed the challenge, write command, and owner-KEY to the nvCIM macro, which subsequently performed a secure data-write operation. The data-read operation involved the FPGA feeding the challenge, read command, and user-KEY to the nvCIM macro to perform the secure data-read operation. Finally, the FPGA passed output data from the nvCIM macro to the PC, which compared decrypted data from the nvCIM macro with target plaintext data, the results of which were displayed on an LCD screen.
Supplementary information
Supplementary Information (download PDF )
Supplementary Figs. 1–6 and Notes 1–4.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chiu, YC., Khwa, WS., Yang, CS. et al. A CMOS-integrated spintronic compute-in-memory macro for secure AI edge devices. Nat Electron 6, 534–543 (2023). https://doi.org/10.1038/s41928-023-00994-0
Received:
Accepted:
Published:
Version of record:
Issue date:
DOI: https://doi.org/10.1038/s41928-023-00994-0
This article is cited by
-
Self-Rectifying Memristors for Beyond-CMOS Computing: Mechanisms, Materials, and Integration Prospects
Nano-Micro Letters (2026)
-
Self-rectifying memristors with high rectification ratio for attack-resilient autonomous driving systems
Nature Communications (2025)
-
Radiofrequency signal processing with a memristive system-on-a-chip
Nature Electronics (2025)
-
A near-threshold memristive computing-in-memory engine for edge intelligence
Nature Communications (2025)
-
Demonstration of high-reconfigurability and low-power strong physical unclonable function empowered by FeFET cycle-to-cycle variation and charge-domain computing
Nature Communications (2025)


