Neural networks are often implemented with reduced precision to meet the tight energy and memory budgets of edge computing devices. Chakraborty et al. develop a technique for assessing which layers can be quantized, and by how much, without sacrificing too much performance.
- Indranil Chakraborty
- Deboleena Roy
- Kaushik Roy
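
As a rough illustration of what choosing a precision per layer can look like, the sketch below quantizes each layer's weights uniformly at several bit widths and keeps the smallest width whose quantization error stays under a tolerance. This is a generic error-threshold heuristic, not the method of Chakraborty et al.; the function names, layer names, candidate bit widths, and tolerance are all assumptions made for illustration.

```python
import random


def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats to `bits` bits."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a symmetric signed grid")
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / levels  # step size between representable values
    return [round(w / scale) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def choose_bits(weights, candidate_bits, tolerance):
    """Smallest candidate bit width whose quantization MSE is within
    `tolerance`; falls back to the largest candidate if none qualifies."""
    for bits in sorted(candidate_bits):
        if mse(weights, quantize(weights, bits)) <= tolerance:
            return bits
    return max(candidate_bits)


# Hypothetical layers: random weights with different spreads (assumption).
random.seed(0)
layers = {
    "conv1": [random.gauss(0.0, 0.5) for _ in range(1000)],
    "fc": [random.gauss(0.0, 0.05) for _ in range(1000)],
}

# Per-layer precision plan under an assumed error tolerance.
plan = {
    name: choose_bits(w, [2, 4, 8, 16], tolerance=1e-4)
    for name, w in layers.items()
}
print(plan)
```

Weight reconstruction error is only a proxy: a practical assessment would measure the effect of each layer's precision on task accuracy, which this sketch does not attempt.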