MARR: Module-Adaptive Residual Reconstruction for Low-Bit Post-Training Quantization
Signal: 72
Hype: 18
In three lines
MARR introduces a low-bit (≤4-bit) post-training quantization method for LLMs and Vision Transformers. It uses module-specific scaling coefficients to balance accumulated-error correction against residual-induced bias, with a PID-based adaptive update strategy. It reports gains of up to 20.2% on LLMs and 4.6% on ViTs over prior residual reconstruction methods.
Summary generated by Claude — human-verified