Table 3 Performance comparison (throughput in tera-operations per second (TOPS), density, and efficiency) between hybrid CMOS/Hybrid prototypes and full-CMOS neuromorphic accelerators

From: Hardware implementation of memristor-based artificial neural networks

 

| Accelerator | Exp./Sim. | Type | Process (nm) | Activation resolution | Weight resolution | Clock speed | Benchmarked workload | Weight storage | R_high (Ω) | R_low (Ω) | Array size | ADC type | Throughput (TOPS) | Density (TOPS per mm²) | Efficiency (TOPS per W) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NVIDIA T4 (ref. 277) | Exp. | Full-CMOS | 12 | 8-bit int | 8-bit int | 2.6 GHz | ResNet-50 (batch = 128) | -- | -- | -- | -- | -- | 22.2, 130 (peak) | 0.04, 0.24 (peak) | 0.32 |
| Google TPU v1 (ref. 19) | Exp. | Full-CMOS | 28 | 8-bit int | 8-bit int | 700 MHz | MLPs, LSTMs, CNNs | -- | -- | -- | -- | -- | 21.4, 92 (peak) | 0.06, 0.28 (peak) | 2.3 (peak) |
| Habana Goya HL-1000 (ref. 278) | Exp. | Full-CMOS | 16 | 16-bit int | 16-bit int | 2.1 GHz (CPU) | ResNet-50 (batch = 10) | -- | -- | -- | -- | -- | 63.1 | -- | 0.61 |
| DaDianNao (ref. 279) | Sim. | Full-CMOS | 28 | 16-bit fixed-pt. | 16-bit fixed-pt. | 606 MHz | Peak performance | -- | -- | -- | -- | -- | 5.58 | 0.08 | 0.35 |
| UNPU (ref. 280) | Exp. | Full-CMOS | 65 | 16 bits | 1 bit | 200 MHz | Peak performance | -- | -- | -- | -- | -- | 7.37 | 0.46 | 50.6 |
| Reference mixed-signal (ref. 281) | Exp. | Full-CMOS | 28 | 1 bit | 1 bit | 10 MHz | Binary CNN (CIFAR-10) | -- | -- | -- | -- | -- | 0.478 | 0.1 | 532 |
| ISAAC (ref. 160) | Exp. | RRAM-CMOS | 32 | 16 bits | 16 bits | 1.2 GHz | Peak performance | ReRAM (8×2-bit) | ~2 M | ~2 k | 128×128 | SAR (8-bit) | 41.3 | 0.48 | 0.63 |
| Newton (ref. 282) | Exp. | RRAM-CMOS | 32 | 16 bits | 16 bits | 1.2 GHz | Peak performance | ReRAM (8×2-bit) | ~2 M | ~2 k | 128×128 | SAR (8-bit) | -- | 0.68 | 0.92 |
| PUMA (ref. 154) | Exp. | RRAM-CMOS | 32 | 16 bits | 16 bits | 1.0 GHz | Peak performance | ReRAM (8×2-bit) | 1 M | 100 k | 128×128 | SAR | 26.2 | 0.29 | 0.42 |
| PRIME (ref. 125) | Sim. | RRAM-CMOS | 65 | 6 bits | 8 bits | 3.0 GHz (CPU) | -- | ReRAM | 20 k | 1 k | 256×256 | Ramp (6-bit) | -- | -- | -- |
| Memristive Boltzmann machine (ref. 283) | Sim. | RRAM-CMOS | 22 | 32 bits | 32 bits | 3.2 GHz (CPU) | -- | ReRAM | 1.1 G | 315 k | 512×512 | SAR | -- | -- | -- |
| 3D-aCortex (ref. 83) | Exp. | RRAM-CMOS | 55 | 4 bits | 4 bits | 1.0 GHz | GNMT | NAND flash | -- | 2.3 M | 64×128 | Temporal to digital (4-bit) | 10.7 | 0.58 | 70.4 |
| Analog-AI Using Dense 2-D Mesh (ref. 284) | Sim. | RRAM-CMOS | 14 | 8 bits | Analogue | 1.0 GHz | RNN/LSTM | PCM | No data | No data | 512×512 | Current controlled oscillator based | 376.7 | No data | 65.6 |
  1. Adapted with permission under a CC BY 4.0 license from ref. 276.
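
The three performance columns of Table 3 can be cross-read to estimate the silicon area and power behind each entry. The sketch below does this under two assumptions that the table itself does not state: that density is throughput divided by area and efficiency is throughput divided by power, and that all three figures in a row refer to the same operating point. The function names are illustrative only. Only rows that report a single, unambiguous value in each of the three columns are included, so the results should be read as rough consistency checks and not as reported chip specifications.

```python
# Rows copied from Table 3:
# (throughput in TOPS, density in TOPS per mm^2, efficiency in TOPS per W).
ROWS = {
    "DaDianNao": (5.58, 0.08, 0.35),
    "UNPU": (7.37, 0.46, 50.6),
    "ISAAC": (41.3, 0.48, 0.63),
    "PUMA": (26.2, 0.29, 0.42),
    "3D-aCortex": (10.7, 0.58, 70.4),
}


def implied_area_and_power(throughput_tops, density_tops_per_mm2, efficiency_tops_per_w):
    """Return (area in mm^2, power in W) implied by the three metrics.

    Assumption (not stated in the table): density = throughput / area and
    efficiency = throughput / power, all taken at the same operating point.
    """
    area_mm2 = throughput_tops / density_tops_per_mm2
    power_w = throughput_tops / efficiency_tops_per_w
    return area_mm2, power_w


def summarize(rows):
    """Map each accelerator name to its implied (area, power) pair."""
    return {name: implied_area_and_power(*metrics) for name, metrics in rows.items()}


if __name__ == "__main__":
    for name, (area, power) in summarize(ROWS).items():
        print(f"{name:12s} area ~ {area:7.1f} mm^2, power ~ {power:7.2f} W")
```

Rows that mix benchmarked and peak values (for example NVIDIA T4 and Google TPU v1) are left out on purpose, because it is unclear which throughput figure pairs with the reported efficiency.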