AWQ: Activation-Aware Weight Quantization

AWQ is a weight-only quantization method that achieves strong results through a surprisingly simple insight: not all weights matter equally, and the ones that do can be identified by looking at activations, not weights. The activations themselves stay in full precision; they act as the "budget" traded away to protect salient weights, and quantizing them too would undermine the very scaling trick that makes AWQ work.
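
To make that scaling trick concrete, here is a minimal NumPy sketch (an illustrative toy, not the paper's implementation): per-input-channel saliency is estimated from activation magnitudes on calibration data, the corresponding weight columns are scaled up before round-to-nearest quantization, and the activations are divided by the same scales so the matmul is unchanged in exact arithmetic. The shapes, the fixed exponent of 0.5, and the per-output-channel quantizer are simplifying assumptions; AWQ itself searches the exponent on calibration data and quantizes weights in small groups.

```python
import numpy as np

def quantize_rtn(w, n_bits=4):
    """Round-to-nearest quantization, one scale per output channel (simplified)."""
    q_max = 2 ** (n_bits - 1) - 1
    step = np.abs(w).max(axis=1, keepdims=True) / q_max
    return np.round(w / step) * step

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))    # weights: (out_features, in_features)
X = rng.normal(size=(256, 64))   # calibration activations: (tokens, in_features)
X[:, :4] *= 20.0                 # a few input channels carry outsized activations

# Saliency is read off the activations, not the weights: mean |x| per input channel.
act_scale = np.abs(X).mean(axis=0)

# Per-input-channel scales (fixed exponent 0.5 here; AWQ searches this value).
s = act_scale ** 0.5
s /= s.mean()

Y_ref = X @ W.T                           # full-precision reference
Y_rtn = X @ quantize_rtn(W).T             # plain round-to-nearest
Y_awq = (X / s) @ quantize_rtn(W * s).T   # W x == (W s)(x / s); only the quantization error changes

print("RTN output error:             ", np.abs(Y_ref - Y_rtn).mean())
print("activation-aware output error:", np.abs(Y_ref - Y_awq).mean())
```

The key line is the last matmul: scaling weights up and activations down leaves the product mathematically identical, so any accuracy gain comes purely from the salient weight channels landing on a finer effective quantization grid.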