A carbon-nanotube-based tensor processing unit

Si, Jia; Zhang, Panpan; Zhao, Chenyi; Lin, Dongyi; Xu, Lin; Xu, Haitao; Liu, Lijun; Jiang, Jianhua; Peng, Lian-Mao; Zhang, Zhiyong

doi:10.1038/s41928-024-01211-2

Article
Published: 22 July 2024

Nature Electronics volume 7, pages 684–693 (2024)

Abstract

The growth of data-intensive computing tasks requires processing units with higher performance and energy efficiency, but these requirements are increasingly difficult to achieve with conventional semiconductor technology. One potential solution is to combine developments in devices with innovations in system architecture. Here we report a tensor processing unit (TPU) that is based on 3,000 carbon nanotube field-effect transistors and can perform energy-efficient convolution operations and matrix multiplication. The TPU is constructed with a systolic array architecture that allows parallel 2 bit integer multiply–accumulate operations. A five-layer convolutional neural network based on the TPU can perform MNIST image recognition with an accuracy of up to 88% for a power consumption of 295 µW. We use an optimized nanotube fabrication process that offers a semiconductor purity of 99.9999% and ultraclean surfaces, leading to transistors with high on-current densities and uniformity. Using system-level simulations, we estimate that an 8 bit TPU made with nanotube transistors at a 180 nm technology node could reach a main frequency of 850 MHz and an energy efficiency of 1 tera-operations per second per watt.
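
As an illustration of the systolic-array idea described in the abstract, the following Python sketch simulates a small grid of processing elements (PEs) that perform multiply–accumulate operations on 2 bit integers to compute a matrix product. It is not the authors' circuit. The output-stationary dataflow, the function name `systolic_matmul`, the unsigned operand encoding and the unbounded accumulator width are all assumptions made for illustration, since the abstract does not specify them.

```python
def systolic_matmul(A, B, bits=2):
    """Simulate C = A x B on an output-stationary systolic array.

    Assumptions (not taken from the paper): operands are unsigned
    `bits`-wide integers, PE(i, j) keeps the running sum for C[i][j],
    and the accumulator is wide enough that it never overflows.
    """
    n, k, m = len(A), len(B), len(B[0])
    limit = 1 << bits  # 2 bit operands must lie in 0..3
    for row in A + B:
        for value in row:
            if not 0 <= value < limit:
                raise ValueError(f"operand {value} does not fit in {bits} bits")

    a_reg = [[0] * m for _ in range(n)]  # operand moving left to right
    b_reg = [[0] * m for _ in range(n)]  # operand moving top to bottom
    acc = [[0] * m for _ in range(n)]    # stationary partial sums

    # The last useful operand pair reaches PE(n-1, m-1) at cycle k+n+m-3.
    for t in range(k + n + m - 2):
        # Shift A operands one PE to the right; feed row i with a skew of i cycles.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            s = t - i
            a_reg[i][0] = A[i][s] if 0 <= s < k else 0
        # Shift B operands one PE down; feed column j with a skew of j cycles.
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            s = t - j
            b_reg[0][j] = B[s][j] if 0 <= s < k else 0
        # Every PE performs one multiply-accumulate per cycle, in parallel.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


if __name__ == "__main__":
    A = [[1, 2, 3], [0, 1, 2], [3, 3, 3]]
    B = [[1, 0, 2], [2, 1, 0], [3, 3, 1]]
    print(systolic_matmul(A, B))  # [[14, 11, 5], [8, 7, 2], [18, 12, 9]]
```

The skewed feeding is what lets every PE reuse operands passed on by its neighbours instead of fetching them from memory, which is the usual argument for the energy efficiency of systolic designs. Zeros fed in outside the valid window add nothing to the sums.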

**Fig. 1: CNT FET-based digit computing system for tensor processing acceleration.**

**Fig. 2: Electrical characteristics of a top-gated p-FET and basic logic gates.**

**Fig. 3: PE and data flow for convolution in the CNT TPU.**

**Fig. 4: Image edge extraction with single and combined kernels.**

**Fig. 5: Five-layer CNN with the CNT TPU, performance metrics and comparison of different systems.**

**Fig. 6: Matrix multiplication by the CNT TPU.**

Data availability

The data that support the plots within this paper and the other findings of this study are available from the corresponding author upon reasonable request.

Code availability

The custom codes for this study are available from the corresponding author upon request.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Project Nos. 2022YFB4401600 and 2021YFA1202904 to Z.Z.), the Natural Science Foundation of China (Project No. 62274006 to J.S. and Project Nos. 62225101 and U21A6004 to Z.Z.) and Peking Nanofab.

Author information

These authors contributed equally: Jia Si, Panpan Zhang.

Authors and Affiliations

Key Laboratory for the Physics and Chemistry of Nanodevices and Center for Carbon-based Electronics, School of Electronics, Peking University, Beijing, China
Jia Si, Chenyi Zhao, Lin Xu, Lijun Liu, Jianhua Jiang, Lian-Mao Peng & Zhiyong Zhang
State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing, China
Panpan Zhang
Hunan Institute of Advanced Sensing and Information Technology, Xiangtan University, Xiangtan, China
Dongyi Lin & Zhiyong Zhang
Beijing Institute of Carbon-based Integrated Circuits, Beijing, China
Haitao Xu, Lian-Mao Peng & Zhiyong Zhang

Contributions

Z.Z. and L.-M.P. proposed and supervised the project. J.S. designed the system and circuit and performed the fabrication and measurements. P.Z., D.L. and J.J. performed the simulations. J.S., C.Z., H.X., L.X. and L.L. analysed the data. J.S., Z.Z. and L.-M.P. co-wrote the manuscript. All authors discussed the results and commented on the manuscript.

Corresponding authors

Correspondence to Lian-Mao Peng or Zhiyong Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Electronics thanks Franz Kreupl, Jinbo Pang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figs. 1–9 and Tables 1–3.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Si, J., Zhang, P., Zhao, C. et al. A carbon-nanotube-based tensor processing unit. Nat Electron 7, 684–693 (2024). https://doi.org/10.1038/s41928-024-01211-2

Received: 31 May 2022
Accepted: 23 June 2024
Published: 22 July 2024
Issue date: August 2024
DOI: https://doi.org/10.1038/s41928-024-01211-2

This article is cited by

Advanced Design for High-Performance and AI Chips
- Ying Cao
- Yuejiao Chen
- Bingang Xu
Nano-Micro Letters (2026)
Nano-seeding catalysts for high-density arrays of horizontally aligned carbon nanotubes with wafer-scale uniformity
- Ying Xie
- Yue Li
- Jin Zhang
Nature Communications (2025)
Hardware accelerators based on nanotube transistors
- Kaixiang Kang
- Lingzhi Wu
- Jianwen Zhao
Nature Electronics (2024)
Manufacturing carbon nanotube transistors using lift-off process: limitations and prospects
- Xilong Gao
- Jia Si
- Zhiyong Zhang
Moore and More (2024)
