arXiv cs.LG

InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization

Signal: 82
Hype: 15
In three lines: InfoQuant proposes a training-free post-training quantization (PTQ) method for LLMs. It uses a Peak Suppression Orthogonal Transformation (PSOT) to reshape activations into quantization-friendly distributions. On LLaMA-2 13B under W4A4KV4 (4-bit weights, activations, and KV cache), it preserves 97% of floating-point accuracy and reduces the performance gap by 42% relative to the prior state of the art.
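
To make the idea of reshaping activations with an orthogonal transformation concrete, here is a minimal sketch. It is not PSOT, whose construction this summary does not describe. It substitutes a fixed, normalized Hadamard rotation, a common stand-in for this family of methods. All names (`hadamard`, `fake_quant`), the toy data, and the single scaled-up outlier channel are assumptions made for illustration. The sketch shows two things: rotating the activations by an orthogonal matrix and the weights by its transpose leaves the layer output unchanged, and the rotation spreads an outlier channel across all channels, so 4-bit round-to-nearest quantization loses less.

```python
import numpy as np

def hadamard(n):
    """Sylvester Hadamard matrix (n must be a power of two), scaled so it is orthogonal."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def fake_quant(x, bits=4):
    """Symmetric per-row round-to-nearest quantization, returned in dequantized form."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max(axis=-1, keepdims=True) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 64))   # toy activations: 32 tokens, 64 channels
X[:, 3] *= 20.0                 # assumption: one channel carries large outliers
W = rng.normal(size=(64, 16))   # toy weight matrix

Q = hadamard(64)                # orthogonal: Q @ Q.T == I
X_rot = X @ Q                   # rotated activations (outlier energy spread out)
W_rot = Q.T @ W                 # counter-rotated weights, so X_rot @ W_rot == X @ W

# Relative 4-bit quantization error of the activations, before and after rotation.
err_plain = np.linalg.norm(fake_quant(X) - X) / np.linalg.norm(X)
err_rot = np.linalg.norm(fake_quant(X_rot) - X_rot) / np.linalg.norm(X_rot)
print(f"relative error without rotation: {err_plain:.3f}")
print(f"relative error with rotation:    {err_rot:.3f}")
```

Because the rotation is orthogonal, it can be folded into the weights offline and adds no error of its own. Any gain comes entirely from the flatter per-token activation distribution that the quantizer sees.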
Llama, Papers, Benchmarks

Summary generated by Claude — human-verified