Abstract
Sorting is a fundamental task in modern computing systems. Hardware sorters are typically based on the von Neumann architecture, and their performance is limited by the data transfer bandwidth and CMOS memory. Sort-in-memory using memristors could help overcome these limitations, but current systems still rely on comparison operations so that sorting performance remains limited. Here we describe a fast and reconfigurable sort-in-memory system that uses digit reads of one-transistor–one-resistor memristor arrays. We develop digit-read tree node skipping, which supports various data quantities and data types. We extend this approach with the multi-bank, bit-slice and multi-level strategies for cross-array tree node skipping. We experimentally show that our comparison-free sort-in-memory system can improve throughput by ×7.70, energy efficiency by ×160.4 and area efficiency by ×32.46 compared with conventional sorting systems. To illustrate the potential of the approach to solve practical sorting tasks, as well as its compatibility with other compute-in-memory schemes, we apply it to Dijkstra’s shortest path search and neural network inference with in situ pruning.
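To make "comparison-free" sorting by digit reads concrete, the following is a minimal sketch of a generic bit-serial minimum search, assuming each number occupies one row of a binary array and one bit column is read per step. It is not the tree node skipping (TNS) method described here, whose details are not given in the abstract. The function name `digit_read_sort` and the read counter are hypothetical and serve only as illustration.

```python
def digit_read_sort(values, width):
    """Ascending sort of unsigned integers by repeated bit-serial minimum search.

    Illustrative only: models digit (bit-column) reads, not the paper's TNS scheme.
    Returns the sorted values and the number of column reads performed.
    """
    # bits[r][k] models the cell in row r that stores bit k (k = 0 is the MSB).
    bits = [[(v >> (width - 1 - k)) & 1 for k in range(width)] for v in values]
    remaining = set(range(len(values)))  # rows not yet emitted
    order = []
    reads = 0
    while remaining:
        active = set(remaining)  # candidate rows for the current minimum
        for k in range(width):
            reads += 1  # one digit read of column k across the active rows
            zeros = {r for r in active if bits[r][k] == 0}
            if zeros:
                # Some candidate has a 0 here, so rows with a 1 cannot be the minimum.
                active = zeros
        # Every row still active stores the same value (duplicates), so emit them all.
        order.extend(active)
        remaining -= active
    return [values[r] for r in order], reads


sorted_values, column_reads = digit_read_sort([9, 2, 14, 3], width=4)
print(sorted_values, column_reads)
```

No two stored numbers are ever compared with each other: each step only reads one bit column and excludes rows. This baseline rereads every column for each output value, which is the kind of redundancy a skipping strategy would aim to reduce.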
Data availability
Source data are available on Zenodo at https://doi.org/10.5281/zenodo.15295945 (ref. 52) and are also provided with this paper. Other data that support the findings of this study are available from the corresponding authors upon reasonable request.
Code availability
The core source code that implements TNS and CA-TNS is available on Zenodo at https://doi.org/10.5281/zenodo.15295945 (ref. 52). Other code is available from the corresponding authors upon reasonable request.
References
Raihan, M. A. & Aamodt, T. Sparse weight activation training. In Proc. 33rd Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 15625–15638 (Curran Associates, 2020).
Elsken, T., Metzen, J. H. & Hutter, F. Neural architecture search: a survey. J. Mach. Learn. Res. 20, 1997–2017 (2019).
Graves, A. et al. Hybrid computing using a neural network with dynamic external memory. Nature 538, 471–476 (2016).
Petersen, F., Borgelt, C., Kuehne, H. & Deussen, O. Differentiable sorting networks for scalable sorting and ranking supervision. In Proc. International Conference on Machine Learning (ed. Lawrence, N.) 8546–8555 (PMLR, 2021).
Kim, J. Z. & Bassett, D. S. A neural machine code and programming framework for the reservoir computer. Nat. Mach. Intell. 5, 622–630 (2023).
Tao, Y. & Zhang, Z. HiMA: a fast and scalable history-based memory access engine for differentiable neural computer. In Proc. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (ed. Gizopoulos, D.) 845–856 (IEEE, 2021).
Taniar, D. & Rahayu, J. W. Parallel database sorting. Inf. Sci. 146, 171–219 (2002).
Graefe, G. Implementing sorting in database systems. ACM Comput. Surv. https://doi.org/10.1145/1132960.113296 (2006).
Govindaraju, N., Gray, J., Kumar, R. & Manocha, D. GPUTeraSort: high performance graphics co-processor sorting for large database management. In Proc. 2006 ACM SIGMOD International Conference on Management of Data (eds Yu, C. & Scheuermann, P.) 325–336 (Association for Computing Machinery, 2006).
Salamat, S. et al. NASCENT: near-storage acceleration of database sort on SmartSSD. In Proc. 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (ed. Shannon, L.) 262–272 (Association for Computing Machinery, 2021).
Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998).
Guan, Z. & Cutrell, E. An eye tracking study of the effect of target rank on web search. In Proc. SIGCHI Conference on Human Factors in Computing Systems (ed. Rosson, M. B.) 417–420 (Association for Computing Machinery, 2007).
Mankowitz, D. J. et al. Faster sorting algorithms discovered using deep reinforcement learning. Nature 618, 257–263 (2023).
Yang, Y. et al. Sorting sub-150-nm liposomes of distinct sizes by DNA-brick-assisted centrifugation. Nat. Chem. 13, 335–342 (2021).
Iram, S. & Hinczewski, M. A molecular motor for cellular delivery and sorting. Nat. Phys. 19, 1081–1082 (2023).
Morris, K. L. et al. Chemically programmed self-sorting of gelator networks. Nat. Commun. 4, 1480 (2013).
Tkachenko, G. & Brasselet, E. Optofluidic sorting of material chirality by chiral light. Nat. Commun. 5, 3577 (2014).
Arnold, M. S., Green, A. A., Hulvat, J. F., Stupp, S. I. & Hersam, M. C. Sorting carbon nanotubes by electronic structure using density differentiation. Nat. Nanotechnol. 1, 60–65 (2006).
Cole, R. Parallel merge sort. SIAM J. Comput. 17, 770–785 (1988).
Hoare, C. A. Quicksort. Comput. J. 5, 10–16 (1962).
Lanius, C. & Gemmeke, T. Multi-function CIM array for genome alignment applications built with fully digital flow. In Proc. 2022 IEEE Nordic Circuits and Systems Conference (NorCAS) (ed. Nurmi, J.) 1–7 (IEEE, 2022).
Li, Z., Challapalle, N., Ramanathan, A. K. & Narayanan, V. IMC-sort: in-memory parallel sorting architecture using hybrid memory cube. In Proc. 2020 on Great Lakes Symposium on VLSI (eds Mohsenin, T. & Zhao, W.) 45–50 (Association for Computing Machinery, 2020).
Qiao, W., Oh, J., Guo, L., Chang, M.-C. F. & Cong, J. Fans: FPGA-accelerated near-storage sorting. In Proc. 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (ed. Bobda, C.) 106–114 (IEEE, 2021).
Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).
Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).
Yu, S., Jiang, H., Huang, S., Peng, X. & Lu, A. Compute-in-memory chips for deep learning: recent trends and prospects. IEEE Circuits Syst. Mag. 21, 31–56 (2021).
Yan, B. et al. A 1.041-Mb/mm2 27.38-TOPS/W signed-INT8 dynamic-logic-based ADC-less SRAM compute-in-memory macro in 28nm with reconfigurable bitwise operation for AI and embedded applications. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Beigné, É.) vol. 65, 188–190 (IEEE, 2022).
Wu, P.-C. et al. A 22nm 832kb hybrid-domain floating-point SRAM in-memory-compute macro with 16.2–70.2 TFLOPS/W for high-accuracy AI-edge devices. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Cantatore, E.) 126–128 (IEEE, 2023).
Choi, E. et al. A 333TOPS/W logic-compatible multi-level embedded flash compute-in-memory macro with dual-slope computation. In Proc. 2023 IEEE Custom Integrated Circuits Conference (CICC) (ed. Soenen, E.) 1–2 (IEEE, 2023).
Hu, H.-W. et al. A 512GB in-memory-computing 3D-NAND flash supporting similar-vector-matching operations on edge-AI devices. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Beigné, É.) vol. 65, 138–140 (IEEE, 2022).
Hung, J.-M. et al. An 8-Mb DC-current-free binary-to-8b precision ReRAM nonvolatile computing-in-memory macro using time-space-readout with 1286.4-21.6 TOPS/W for edge-AI devices. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Zhang, K.) 1–3 (IEEE, 2022).
Huang, W.-H. et al. A nonvolatile AI-edge processor with 4MB SLC-MLC hybrid-mode ReRAM compute-in-memory macro and 51.4-251TOPS/W. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) (eds Wambacq, P. & O’Mahony, F.) 15–17 (IEEE, 2023).
Cai, F. et al. Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks. Nat. Electron. 3, 409–418 (2020).
Wang, R. et al. Implementing in-situ self-organizing maps with memristor crossbar arrays for data mining and optimization. Nat. Commun. 13, 2289 (2022).
Xue, C.-X. et al. A CMOS-integrated compute-in-memory macro based on resistive random-access memory for AI edge devices. Nat. Electron. 4, 81–90 (2021).
Hung, J.-M. et al. A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices. Nat. Electron. 4, 921–930 (2021).
Chiu, Y.-C. et al. A CMOS-integrated spintronic compute-in-memory macro for secure AI edge devices. Nat. Electron. 6, 534–543 (2023).
Khalid, M. Review on various memristor models, characteristics, potential applications, and future works. Trans. Electr. Electron. Mater. 20, 289–298 (2019).
Zidan, M. A. et al. A general memristor-based partial differential equation solver. Nat. Electron. 1, 411–420 (2018).
Alam, M. R., Najafi, M. H. & Taherinejad, N. Sorting in memristive memory. ACM J. Emerg. Technol. Comput. Syst. 18, 1–21 (2022).
Dean, J. & Ghemawat, S. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 107–113 (2008).
Dijkstra, E. W. in Edsger Wybe Dijkstra: His Life, Work, and Legacy (eds Apt, K. R. & Hoare, T.) Ch. 13 (ACM Books, 2022).
Qi, C. R., Yi, L., Su, H. & Guibas, L. J. PointNet++: deep hierarchical feature learning on point sets in a metric space. In Proc. 30th Conference on Advances in Neural Information Processing Systems (eds Guyon, I. et al.) (Curran Associates, 2017).
Prasad, A. K., Rezaalipour, M., Dehyadegari, M. & Bojnordi, M. N. Memristive data ranking. In Proc. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (eds Ahn, J. H. et al.) 440–452 (IEEE, 2021).
Yu, L., Jing, Z., Yang, Y. & Tao, Y. Fast and scalable memristive in-memory sorting with column-skipping algorithm. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS) (eds Kang, S.-M. (S.), et al.) 590–594 (IEEE, 2022).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Lastras-Montaño, M. A. et al. Ratio-based multi-level resistive memory cells. Sci. Rep. 11, 1351 (2021).
Cheriton, D. & Tarjan, R. E. Finding minimum spanning trees. SIAM J. Comput. 5, 724–742 (1976).
Lin, L., Cao, H. & Luo, Z. Dijkstra’s algorithm-based ray tracing method for total focusing method imaging of CFRP laminates. Compos. Struct. 215, 298–304 (2019).
Mahmoud, M. et al. Tensordash: exploiting sparsity to accelerate deep neural network training. In Proc. 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (ed. Gizopoulos, D.) 781–795 (IEEE, 2020).
Yousefzadeh, A. & Sifalakis, M. Training for temporal sparsity in deep neural networks, application in video processing. Preprint at https://arxiv.org/abs/2107.07305 (2021).
Yu, L. A fast and reconfigurable sort-in-memory system based on memristors. Zenodo https://doi.org/10.5281/zenodo.15295945 (2025).
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2023YFB4502200), Guangdong Provincial Key Laboratory of In-Memory Computing Chips (Grant No. 2024B1212020002), the National Natural Science Foundation of China (Grant Nos 61925401, 61927901, 8206100486, 92164302 and 92464203), Beijing Natural Science Foundation (Grant Nos L234026 and F251035) and the 111 Project (Grant No. B18001). Y.Y. acknowledges support from the Fok Ying-Tong Education Foundation.
Author information
Contributions
L.Y. and Y.T. designed the entire concept and experiment. T.Z. fabricated and characterized the devices. L.Y. and Y.T. were in charge of the hardware system integration, designed the experimental methodologies for each part and conducted the related data analyses. L.Y., Y.T., Z.W., X.W., Z.P., B.W., Z.J., J.L., Y.L., Z.X. and Y.Z. contributed to memristor testing, circuit design, PCB integration, hardware system verification and software simulations. L.Y. and Y.T. contributed to the interpretation of the results. L.Y., Y.T. and Y.Y. wrote the paper with input from all authors. B.Y., Y.T. and Y.Y. supervised the whole project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Electronics thanks Themis Prodromakis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Sorting example with multi-bank CA-TNS strategy.
Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.
Extended Data Fig. 2 Sorting example with bit-slice CA-TNS strategy.
Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.
Extended Data Fig. 3 Sorting example with multi-level CA-TNS strategy.
Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.
Supplementary information
Supplementary Information
Supplementary Sections 1–18, Figs. 1–32 and Tables 1–9.
Supplementary Video 1
CA-TNS sorting.
Supplementary Video 2
Sorting with in situ pruning.
Source data
Source Data Fig. 2
Source data for Fig. 2.
Source Data Fig. 4
Source data for Fig. 4.
Source Data Fig. 5
Source data for Fig. 5.
Source Data Fig. 6
Source data for Fig. 6.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yu, L., Zhang, T., Wang, Z. et al. A fast and reconfigurable sort-in-memory system based on memristors. Nat Electron 8, 597–609 (2025). https://doi.org/10.1038/s41928-025-01405-2
DOI: https://doi.org/10.1038/s41928-025-01405-2