Table 1 Electro-photonic DNN accelerators and their operational efficiency over CMOS-based accelerators

From: Photonics for sustainable AI

| Accelerator | Simulations | Improvements over CMOS-based accelerators |
| --- | --- | --- |
| ADEPT^19 | Workloads: ResNet-50, BERT-large, and RNN-T inference. Comparison: an ADEPT 128 × 128 core running at 10 GHz against ten 128 × 128 systolic arrays running at 1 GHz | 5.73× higher throughput-per-watt, on average, than the systolic arrays |
| Albireo^18 | Workloads: AlexNet and VGG-16 inference. Comparison: Albireo against popular CMOS-based accelerators | 110× lower latency and 74.2× lower energy-delay product (EDP), on average, than the CMOS-based accelerators |
| DEAP-CNN^21 | Workloads: inference of different convolution layers. Comparison: DEAP-CNN against various types of GPUs | 2.8–14× faster with ~25% less energy than the GPUs |
| RecLight^22 | Workloads: inference of several RNN, GRU, and LSTM models. Comparison: RecLight against state-of-the-art CMOS-based RNN accelerators | 37× lower energy-per-bit and 10% higher throughput than the CMOS-based RNN accelerators |
| LT^23 | Workloads: inference of BERT and DeiT transformer variants. Comparison: LT against several electronic hardware platforms, including CPU, GPU, TPU, and FPGA-based accelerators | 2–3 orders of magnitude lower EDP than the electronic accelerators |
| Mirage^24 | Workloads: training of different DNN models, including AlexNet, VGG-16, ResNet-50, YOLOv2, and transformers. Comparison: Mirage against systolic arrays | Achieves FP32-level accuracy in DNN training by using a residue number system. Trains 23.8× faster and reduces EDP by 32.1× in an iso-energy scenario, and consumes 42.8× less power with similar or better EDP in an iso-area scenario, compared to systolic arrays |
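The residue number system (RNS) that Mirage relies on represents an integer by its residues modulo a set of pairwise-coprime moduli, so that addition and multiplication decompose into small, independent per-channel operations; the exact result is recovered via the Chinese Remainder Theorem. A minimal sketch of that idea (the moduli and values here are illustrative, not Mirage's actual parameters):

```python
# Illustrative residue number system (RNS): arithmetic is performed
# independently per residue channel, then reconstructed exactly.
MODULI = (251, 241, 239)  # pairwise-coprime; dynamic range = 251 * 241 * 239

def to_rns(x):
    """Represent integer x by its residues modulo each channel's modulus."""
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    """Channel-wise modular multiplication; no carries cross channels."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    """Reconstruct the integer via the Chinese Remainder Theorem."""
    M = 1
    for m in MODULI:
        M *= m
    x = 0
    for ri, mi in zip(r, MODULI):
        Mi = M // mi
        x += ri * Mi * pow(Mi, -1, mi)  # pow(..., -1, m): modular inverse
    return x % M

a, b = 1234, 5678
assert from_rns(rns_mul(to_rns(a), to_rns(b))) == a * b  # exact product
```

Because each channel's operands stay small, products never overflow a narrow datapath, which is how an RNS design can keep training arithmetic exact enough to match FP32 accuracy.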