InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
Signal: 82
Hype: 15
In three lines
InfoQuant proposes a training-free post-training quantization (PTQ) method for LLMs. It uses a Peak Suppression Orthogonal Transformation (PSOT) to reshape activations into quantization-friendly distributions. On LLaMA-2 13B under W4A4KV4 (4-bit weights, activations, and KV cache), it preserves 97% of floating-point accuracy and reduces the performance gap by 42% relative to the prior state of the art.
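To illustrate why an orthogonal transformation can make activations easier to quantize, here is a minimal sketch. It is not the PSOT construction, which this summary does not describe. It swaps in a generic scaled Hadamard rotation as a stand-in, and every name in it (`hadamard`, `quantize_int4`, the toy shapes, the injected outlier channel) is an assumption made for illustration. The idea shown: because `Q @ Q.T` is the identity, rotating the activations and counter-rotating the weights leaves the floating-point output unchanged, while the rotation spreads a single large channel across all dimensions and lowers the peak that the 4-bit scale has to cover.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix, scaled to be orthogonal.
    n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def quantize_int4(x):
    """Symmetric per-row 4-bit fake quantization (quantize, then dequantize)."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    q = np.clip(np.round(x / scale), -8, 7)
    return q * scale

rng = np.random.default_rng(0)
d = 64
x = rng.normal(size=(16, d))          # toy activations (assumed shape)
x[:, 3] *= 50.0                       # inject one outlier channel
w = rng.normal(size=(d, d)) / np.sqrt(d)  # toy weight matrix, kept in float
Q = hadamard(d)

y_ref = x @ w                              # floating-point reference
y_plain = quantize_int4(x) @ w             # quantize activations directly
y_rot = quantize_int4(x @ Q) @ (Q.T @ w)   # rotate, quantize, counter-rotate

err_plain = np.linalg.norm(y_plain - y_ref)
err_rot = np.linalg.norm(y_rot - y_ref)

print("peak |x| before rotation:", np.abs(x).max())
print("peak |x| after rotation: ", np.abs(x @ Q).max())
print("output error, plain:     ", err_plain)
print("output error, rotated:   ", err_rot)
```

Only the activations are quantized here, so the comparison isolates the effect of the rotation. A full W4A4KV4 pipeline would also quantize the weights and the KV cache.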
Summary generated by Claude — human-verified