
A fast and reconfigurable sort-in-memory system based on memristors

Abstract

Sorting is a fundamental task in modern computing systems. Hardware sorters are typically based on the von Neumann architecture, and their performance is limited by the data transfer bandwidth and CMOS memory. Sort-in-memory using memristors could help overcome these limitations, but current systems still rely on comparison operations so that sorting performance remains limited. Here we describe a fast and reconfigurable sort-in-memory system that uses digit reads of one-transistor–one-resistor memristor arrays. We develop digit-read tree node skipping, which supports various data quantities and data types. We extend this approach with the multi-bank, bit-slice and multi-level strategies for cross-array tree node skipping. We experimentally show that our comparison-free sort-in-memory system can improve throughput by ×7.70, energy efficiency by ×160.4 and area efficiency by ×32.46 compared with conventional sorting systems. To illustrate the potential of the approach to solve practical sorting tasks, as well as its compatibility with other compute-in-memory schemes, we apply it to Dijkstra’s shortest path search and neural network inference with in situ pruning.
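As a rough software analogue of the digit-read idea described in the abstract (a hedged sketch only — the paper's hardware reads digit lines of memristor arrays, which this pure-Python function does not model): a most-significant-bit-first radix sort partitions values by reading one bit position at a time, with no pairwise comparisons.

```python
def msb_radix_sort(values, bits=8):
    """Comparison-free sort: recursively partition on each bit,
    most-significant first, loosely analogous to reading one
    digit (bit column) of a stored array at a time."""
    def sort(vals, bit):
        if bit < 0 or len(vals) <= 1:
            return vals
        # Read the current "digit" of every value and split into
        # the 0-group and the 1-group; no value is compared to another.
        zeros = [v for v in vals if not (v >> bit) & 1]
        ones = [v for v in vals if (v >> bit) & 1]
        return sort(zeros, bit - 1) + sort(ones, bit - 1)
    return sort(list(values), bits - 1)

print(msb_radix_sort([9, 2, 14, 3], bits=4))  # [2, 3, 9, 14]
```

When all values in a subgroup share the same remaining bits, deeper recursion is unnecessary — skipping such branches is the software counterpart of the tree-node-skipping intuition, although the paper's TNS operates on the memristor arrays themselves.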


Fig. 1: Overview of sorting systems.
Fig. 2: Memristor programming and TNS.
Fig. 3: TNS and CA-TNS hardware architecture and operation flow.
Fig. 4: CA-TNS strategies.
Fig. 5: TNS experiment on shortest path search.
Fig. 6: In situ pruning for PointNet++.


Data availability

Source data are available on Zenodo at https://doi.org/10.5281/zenodo.15295945 (ref. 52). Source data are provided with this paper. Other data that support the findings of this study are available from the corresponding authors upon reasonable request.

Code availability

The core source code that implements the TNS and CA-TNS is available on Zenodo at https://doi.org/10.5281/zenodo.15295945 (ref. 52). Other code is available from the corresponding authors upon reasonable request.

References

  1. Raihan, M. A. & Aamodt, T. Sparse weight activation training. In Proc. 33rd Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 15625–15638 (Curran Associates, 2020).

  2. Elsken, T., Metzen, J. H. & Hutter, F. Neural architecture search: a survey. J. Mach. Learn. Res. 20, 1997–2017 (2019).


  3. Graves, A. et al. Hybrid computing using a neural network with dynamic external memory. Nature 538, 471–476 (2016).


  4. Petersen, F., Borgelt, C., Kuehne, H. & Deussen, O. Differentiable sorting networks for scalable sorting and ranking supervision. In Proc. International Conference on Machine Learning (ed. Lawrence, N.) 8546–8555 (PMLR, 2021).

  5. Kim, J. Z. & Bassett, D. S. A neural machine code and programming framework for the reservoir computer. Nat. Mach. Intell. 5, 622–630 (2023).

  6. Tao, Y. & Zhang, Z. HiMA: a fast and scalable history-based memory access engine for differentiable neural computer. In Proc. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (ed. Gizopoulos, D.) 845–856 (IEEE, 2021).

  7. Taniar, D. & Rahayu, J. W. Parallel database sorting. Inf. Sci. 146, 171–219 (2002).


  8. Graefe, G. Implementing sorting in database systems. ACM Comput. Surv. https://doi.org/10.1145/1132960.1132964 (2006).

  9. Govindaraju, N., Gray, J., Kumar, R. & Manocha, D. GPUTeraSort: high performance graphics co-processor sorting for large database management. In Proc. 2006 ACM SIGMOD International Conference on Management of Data (eds Yu, C. & Scheuermann, P.) 325–336 (Association for Computing Machinery, 2006).

  10. Salamat, S. et al. NASCENT: near-storage acceleration of database sort on SmartSSD. In Proc. 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (ed. Shannon, L.) 262–272 (Association for Computing Machinery, 2021).

  11. Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998).


  12. Guan, Z. & Cutrell, E. An eye tracking study of the effect of target rank on web search. In Proc. SIGCHI Conference on Human Factors in Computing Systems (ed. Rosson, M. B.) 417–420 (Association for Computing Machinery, 2007).

  13. Mankowitz, D. J. et al. Faster sorting algorithms discovered using deep reinforcement learning. Nature 618, 257–263 (2023).


  14. Yang, Y. et al. Sorting sub-150-nm liposomes of distinct sizes by DNA-brick-assisted centrifugation. Nat. Chem. 13, 335–342 (2021).


  15. Iram, S. & Hinczewski, M. A molecular motor for cellular delivery and sorting. Nat. Phys. 19, 1081–1082 (2023).

  16. Morris, K. L. et al. Chemically programmed self-sorting of gelator networks. Nat. Commun. 4, 1480 (2013).


  17. Tkachenko, G. & Brasselet, E. Optofluidic sorting of material chirality by chiral light. Nat. Commun. 5, 3577 (2014).


  18. Arnold, M. S., Green, A. A., Hulvat, J. F., Stupp, S. I. & Hersam, M. C. Sorting carbon nanotubes by electronic structure using density differentiation. Nat. Nanotechnol. 1, 60–65 (2006).


  19. Cole, R. Parallel merge sort. SIAM J. Comput. 17, 770–785 (1988).


  20. Hoare, C. A. Quicksort. Comput. J. 5, 10–16 (1962).


  21. Lanius, C. & Gemmeke, T. Multi-function CIM array for genome alignment applications built with fully digital flow. In Proc. 2022 IEEE Nordic Circuits and Systems Conference (NorCAS) (ed. Nurmi, J.) 1–7 (IEEE, 2022).

  22. Li, Z., Challapalle, N., Ramanathan, A. K. & Narayanan, V. IMC-sort: in-memory parallel sorting architecture using hybrid memory cube. In Proc. 2020 on Great Lakes Symposium on VLSI (eds Mohsenin, T. & Zhao, W.) 45–50 (Association for Computing Machinery, 2020).

  23. Qiao, W., Oh, J., Guo, L., Chang, M.-C. F. & Cong, J. Fans: FPGA-accelerated near-storage sorting. In Proc. 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (ed. Bobda, C.) 106–114 (IEEE, 2021).

  24. Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).


  25. Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).


  26. Yu, S., Jiang, H., Huang, S., Peng, X. & Lu, A. Compute-in-memory chips for deep learning: recent trends and prospects. IEEE Circuits Syst. Mag. 21, 31–56 (2021).


  27. Yan, B. et al. A 1.041-Mb/mm2 27.38-TOPS/W signed-INT8 dynamic-logic-based ADC-less SRAM compute-in-memory macro in 28nm with reconfigurable bitwise operation for AI and embedded applications. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Beigné, É.) vol. 65, 188–190 (IEEE, 2022).

  28. Wu, P.-C. et al. A 22nm 832kb hybrid-domain floating-point SRAM in-memory-compute macro with 16.2–70.2 TFLOPS/W for high-accuracy AI-edge devices. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Cantatore, E.) 126–128 (IEEE, 2023).

  29. Choi, E. et al. A 333TOPS/W logic-compatible multi-level embedded flash compute-in-memory macro with dual-slope computation. In Proc. 2023 IEEE Custom Integrated Circuits Conference (CICC) (ed. Soenen, E.) 1–2 (IEEE, 2023).

  30. Hu, H.-W. et al. A 512GB in-memory-computing 3D-NAND flash supporting similar-vector-matching operations on edge-AI devices. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Beigné, É.) vol. 65 138–140 (IEEE, 2022).

  31. Hung, J.-M. et al. An 8-Mb DC-current-free binary-to-8b precision ReRAM nonvolatile computing-in-memory macro using time-space-readout with 1286.4-21.6 TOPS/W for edge-AI devices. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) (ed. Zhang, K.) 1–3 (IEEE, 2022).

  32. Huang, W.-H. et al. A nonvolatile AI-edge processor with 4MB SLC-MLC hybrid-mode ReRAM compute-in-memory macro and 51.4-251TOPS/W. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) (eds Wambacq, P. & O’Mahony, F.) 15–17 (IEEE, 2023).

  33. Cai, F. et al. Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks. Nat. Electron. 3, 409–418 (2020).


  34. Wang, R. et al. Implementing in-situ self-organizing maps with memristor crossbar arrays for data mining and optimization. Nat. Commun. 13, 2289 (2022).


  35. Xue, C.-X. et al. A CMOS-integrated compute-in-memory macro based on resistive random-access memory for AI edge devices. Nat. Electron. 4, 81–90 (2021).


  36. Hung, J.-M. et al. A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices. Nat. Electron. 4, 921–930 (2021).


  37. Chiu, Y.-C. et al. A CMOS-integrated spintronic compute-in-memory macro for secure AI edge devices. Nat. Electron. 6, 534–543 (2023).

  38. Khalid, M. Review on various memristor models, characteristics, potential applications, and future works. Trans. Electr. Electron. Mater. 20, 289–298 (2019).


  39. Zidan, M. A. et al. A general memristor-based partial differential equation solver. Nat. Electron. 1, 411–420 (2018).


  40. Alam, M. R., Najafi, M. H. & Taherinejad, N. Sorting in memristive memory. ACM J. Emerg. Technol. Comput. Syst. 18, 1–21 (2022).


  41. Dean, J. & Ghemawat, S. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 107–113 (2008).


  42. Dijkstra, E. W. in Edsger Wybe Dijkstra: His Life, Work, and Legacy (eds Apt, K. R. & Hoare, T.) Ch. 13 (ACM Books, 2022).

  43. Qi, C. R., Yi, L., Su, H. & Guibas, L. J. PointNet++: deep hierarchical feature learning on point sets in a metric space. In Proc. 30th Conference on Advances in Neural Information Processing Systems (eds Guyon, I. et al.) (Curran Associates, 2017).

  44. Prasad, A. K., Rezaalipour, M., Dehyadegari, M. & Bojnordi, M. N. Memristive data ranking. In Proc. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (eds Ahn, J. H. et al.) 440–452 (IEEE, 2021).

  45. Yu, L., Jing, Z., Yang, Y. & Tao, Y. Fast and scalable memristive in-memory sorting with column-skipping algorithm. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS) (eds Kang, S.-M. (S.), et al.) 590–594 (IEEE, 2022).

  46. Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).


  47. Lastras-Montaño, M. A. et al. Ratio-based multi-level resistive memory cells. Sci. Rep. 11, 1351 (2021).


  48. Cheriton, D. & Tarjan, R. E. Finding minimum spanning trees. SIAM J. Comput. 5, 724–742 (1976).


  49. Lin, L., Cao, H. & Luo, Z. Dijkstra’s algorithm-based ray tracing method for total focusing method imaging of CFRP laminates. Compos. Struct. 215, 298–304 (2019).


  50. Mahmoud, M. et al. Tensordash: exploiting sparsity to accelerate deep neural network training. In Proc. 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (ed. Gizopoulos, D.) 781–795 (IEEE, 2020).

  51. Yousefzadeh, A. & Sifalakis, M. Training for temporal sparsity in deep neural networks, application in video processing. Preprint at https://arxiv.org/abs/2107.07305 (2021).

  52. Yu, L. A fast and reconfigurable sort-in-memory system based on memristors. Zenodo https://doi.org/10.5281/zenodo.15295945 (2025).


Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2023YFB4502200), Guangdong Provincial Key Laboratory of In-Memory Computing Chips (Grant No. 2024B1212020002), the National Natural Science Foundation of China (Grant Nos 61925401, 61927901, 8206100486, 92164302 and 92464203), Beijing Natural Science Foundation (Grant Nos L234026 and F251035) and the 111 Project (Grant No. B18001). Y.Y. acknowledges support from the Fok Ying-Tong Education Foundation.

Author information

Authors and Affiliations

Contributions

L.Y. and Y.T. designed the entire concept and experiment. T.Z. fabricated and characterized the devices. L.Y. and Y.T. were in charge of the hardware system integration, designed the experimental methodologies for each part and conducted the related data analyses. L.Y., Y.T., Z.W., X.W., Z.P., B.W., Z.J., J.L., Y.L., Z.X. and Y.Z. contributed to memristor testing, circuit design, PCB integration, hardware system verification and software simulations. L.Y. and Y.T. contributed to the interpretation of the results. L.Y., Y.T. and Y.Y. wrote the paper with input from all authors. B.Y., Y.T. and Y.Y. supervised the whole project.

Corresponding authors

Correspondence to Yaoyu Tao or Yuchao Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Electronics thanks Themis Prodromakis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Sorting example with multi-bank CA-TNS strategy.

Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.

Extended Data Fig. 2 Sorting example with bit-slice CA-TNS strategy.

Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.

Extended Data Fig. 3 Sorting example with multi-level CA-TNS strategy.

Sorting four 4-bit unsigned numbers (2, 3, 9, 14) in ascending order.
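As a loose software analogue of the bit-slice idea in the extended-data examples (a sketch under the assumption that each 4-bit word is split into a high and a low 2-bit slice held in separate arrays; the actual cross-array hardware behaves differently): the high slice decides the order first, and the low slice is consulted only to break ties.

```python
def bitslice_sort(values):
    """Sort 4-bit unsigned values by splitting each into a high
    and a low 2-bit slice, as if stored in two separate arrays.

    Illustrates the slice-ordering semantics only; the ordering
    here uses Python's sorted(), not the comparison-free hardware."""
    # (high slice, low slice) for each value, e.g. 9 = 0b1001 -> (0b10, 0b01)
    sliced = [((v >> 2) & 0b11, v & 0b11) for v in values]
    # High slice dominates; low slice resolves ties between words
    # whose high slices match, mimicking cross-array cooperation.
    order = sorted(range(len(values)), key=lambda i: sliced[i])
    return [values[i] for i in order]

print(bitslice_sort([9, 2, 14, 3]))  # [2, 3, 9, 14]
```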

Supplementary information

Supplementary Information

Supplementary Sections 1–18, Figs. 1–32 and Tables 1–9.

Supplementary Video 1

CA-TNS sorting.

Supplementary Video 2

Sorting with in situ pruning.

Source data

Source Data Fig. 2

Source data for Fig. 2.

Source Data Fig. 4

Source data for Fig. 4.

Source Data Fig. 5

Source data for Fig. 5.

Source Data Fig. 6

Source data for Fig. 6.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yu, L., Zhang, T., Wang, Z. et al. A fast and reconfigurable sort-in-memory system based on memristors. Nat Electron 8, 597–609 (2025). https://doi.org/10.1038/s41928-025-01405-2

