arXiv cs.LG

QAM-W: Joint 2D Codebook Quantization for LLM Weights via Hadamard Rotation and Activation-Aware Scaling

Signal: 78
Hype: 15
In three lines
QAM-W is a 2D quantization codec for LLM weights that combines Hadamard rotation with activation-aware scaling. Across five models (1.1B–13B parameters), the activation-aware variant at ~5.5 bits per weight (bpw) stays within ±0.4% of BF16 perplexity, matching SmoothQuant W8A8 quality with 32% fewer weight bits. 2D coding outperforms polar coding by 2–15 percentage points.
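To make the three ingredients named in the summary concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes a Sylvester Hadamard matrix for the rotation, a per-input-channel vector `act_scale` for the activation-aware scaling, and a square `levels` x `levels` grid over consecutive weight pairs as a stand-in for the 2D codebook. All function names, the grid layout, and the default `levels=8` are hypothetical choices for illustration.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormal Sylvester Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def quantize_grid_2d(w: np.ndarray, levels: int = 8):
    """Snap consecutive weight pairs to a levels x levels square grid.

    Assumed codebook shape (hypothetical): grid points sit at
    (k + 0.5) * step for integer k in [-levels/2, levels/2 - 1].
    Requires an even number of weights and a non-zero maximum.
    """
    step = np.abs(w).max() / (levels / 2 - 0.5)
    pairs = w.reshape(-1, 2)  # each row is one 2D symbol
    codes = np.clip(np.floor(pairs / step), -levels // 2, levels // 2 - 1)
    return codes.astype(np.int8), step

def dequantize_grid_2d(codes: np.ndarray, step: float, shape) -> np.ndarray:
    """Map integer grid codes back to real-valued weights."""
    return ((codes + 0.5) * step).reshape(shape)

def quantize_weights(W: np.ndarray, act_scale: np.ndarray, levels: int = 8) -> np.ndarray:
    """Scale input channels, rotate, quantize in 2D, then undo both transforms."""
    H = hadamard(W.shape[1])
    W_rot = (W * act_scale) @ H  # activation-aware scaling, then rotation
    codes, step = quantize_grid_2d(W_rot, levels)
    W_rot_hat = dequantize_grid_2d(codes, step, W_rot.shape)
    return (W_rot_hat @ H.T) / act_scale  # back to the original weight space

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
W_hat = quantize_weights(W, act_scale=np.ones(8))
print("max abs error:", np.abs(W_hat - W).max())
```

In this sketch a `levels` x `levels` grid spends `log2(levels**2) / 2` bits per weight before any scale overhead, so `levels=8` gives 3 bits per weight. The square grid is separable, so it behaves like per-coordinate uniform quantization; a non-separable codebook, which the paper may well use, would need a nearest-neighbor search over the codebook instead.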
Fine-tuning · Benchmarks · Papers

Summary generated by Claude — human-verified