Table 1 Performance comparison of our proposed OCPU framework

Type	Programmable units	Matrix dimension	Platform	Accuracy on MNIST test set	Network architecture	Energy efficiency (per MAC)	Precision of results	Compute density (MACs/s/mm²)	Scale
TOPS-CA³⁵	/	9 × 10	/	88.00%	1Conv. (3 5 × 5 kernels) + 1FC	1.58 pJ	7-bit	System	/
Netcast⁵²	/	/	/	98.80%	3FC	10.00 fJ^e	8-bit	System	/
AOM-VMM⁵³	3	1 × 3	/	98.90%	2Conv. (16 3 × 3 kernels) + 2FC	/	/	System	N²
MZI-VMM²⁵	60	4 × 4	Si	76.70% (4 categories, vowel recognition)	2FC	30.00 fJ^a	5-bit	0.56 T^a	N²
MRR-VMM³³	16	4 × 4	Si	/	/	0.18 pJ^b	4-bit^f	1.60 T^b	N²
MRR-VMM⁵¹	4	1 × 4	Si	97.41%	3FC	0.56 pJ^b	4-bit	2.89 T^b	N²
PCM-VMM¹⁶	36	9 × 4	SiN	95.30%	1Conv. (4 2 × 2 kernels) + 1FC	5.00 pJ	7-bit^d	0.60 T	N²
PCM-VMM¹⁶	64	8 × 8	Si	/	/	4.00 pJ	7-bit^d	81.00 T	N²
PMMC-VMM⁵⁴	4	1 × 4	SiN	91.00% (2 categories)	1Conv. (2 2 × 2 kernels) + 1FC	/	6-bit^f	82.00 T^b	N²
IDNN-VMM²⁴	20	10 × 10^c	Si	89.40%	2FC	/	/	/	2N
Flash (analog electronics, simulation)⁵⁵	/	100 × 100	Si	/	/	7.00 fJ	5-bit	18.00 T	/
This work	4	4 × 4^c	SiN	92.17%	1Conv. (2 2 × 2 kernels) + 1FC	4.84 pJ	5-bit	12.74 T	N
Expected from this work	9	9 × 9^c	Si	96.35%	1Conv. (8 3 × 3 kernels) + 1FC	0.95 pJ	5-bit	1.19 P	N

^aValues obtainable with existing state-of-the-art equipment.
^bValues derived from a large-scale outlook of the proposed structure.
^cThe rows of the matrix are correlated with each other.
^dFor comparison under the same standard, the precision is recalculated following the standard deviation listed in the paper.
^eEnergy efficiency of the client.
^fPrecision of weight adjustment.
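
As a reading aid, the two bottom rows of Table 1 can be compared numerically. The short Python sketch below is illustrative only and is not part of the reported method: the helper `parse_quantity` and the dictionary keys are hypothetical names introduced here, and the sketch assumes that the prefixes f, p, T and P in the table denote the usual SI multipliers. It converts the entries for "This work" and "Expected from this work" to base units and forms their ratios.

```python
# Assumption: table entries use standard SI prefixes
# (f = femto, p = pico, T = tera, P = peta).
PREFIX = {"f": 1e-15, "p": 1e-12, "T": 1e12, "P": 1e15}


def parse_quantity(text):
    """Convert an entry such as '4.84 pJ' or '12.74 T' to a float in base units.

    Hypothetical helper: only the first character of the unit is read,
    as the SI prefix; the rest of the unit string is ignored.
    """
    value, unit = text.split()
    return float(value) * PREFIX[unit[0]]


# Values copied from the last two rows of Table 1.
this_work = {
    "energy_per_mac": parse_quantity("4.84 pJ"),  # joules per MAC
    "density": parse_quantity("12.74 T"),         # MACs/s/mm^2
}
expected = {
    "energy_per_mac": parse_quantity("0.95 pJ"),
    "density": parse_quantity("1.19 P"),
}

# Ratio of energy per MAC (demonstrated / expected): larger means the
# expected design spends less energy per operation.
energy_gain = this_work["energy_per_mac"] / expected["energy_per_mac"]

# Ratio of compute density (expected / demonstrated).
density_gain = expected["density"] / this_work["density"]

# The reciprocal of energy per MAC gives MACs per joule.
macs_per_joule = 1.0 / this_work["energy_per_mac"]

print(f"energy per MAC reduced by a factor of {energy_gain:.2f}")
print(f"compute density increased by a factor of {density_gain:.1f}")
print(f"demonstrated efficiency: {macs_per_joule:.3e} MACs/J")
```

Under these assumptions, the expected 9 × 9 design would use roughly five times less energy per MAC and reach roughly 93 times the compute density of the demonstrated 4 × 4 device.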
