Author: Arun Nanda
Understanding post-training quantization, quantization-aware training, and the straight-through estimator
11 min read
Reducing high-precision floating-point weights to low-precision integer weights
13 min read
This introductory article gives an overview of different approaches to reducing model size. It introduces…
11 min read