RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs
Signal: 78
Hype: 22
In three lines
RaBiT introduces residual-aware binarization training for 2-bit LLM quantization.
It addresses inter-path feature co-adaptation by sequentially deriving each binary path from a single shared full-precision weight, ensuring error correction across layers.
Reported results: state-of-the-art performance and a 4.49× inference speedup on an RTX 4090.
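The idea of deriving binary paths one after another from a shared full-precision weight can be illustrated with a plain greedy residual binarization of a single weight row. This is only a minimal sketch under stated assumptions: it is a post-hoc decomposition, not RaBiT's training procedure, which the summary does not detail. The function names (`residual_binarize`, `reconstruct`, `squared_error`) and the mean-absolute-value scale are illustrative choices, not taken from the paper.

```python
import random


def sign(x):
    # Map to {-1, +1}; zero is sent to +1 so every entry stays binary.
    return 1.0 if x >= 0 else -1.0


def residual_binarize(weights, num_paths=2):
    """Greedy residual binarization of one full-precision weight row.

    Hypothetical helper: returns one (scale, signs) pair per binary path.
    """
    residual = list(weights)
    paths = []
    for _ in range(num_paths):
        signs = [sign(r) for r in residual]
        # Mean absolute value is the least-squares scale for sign(residual).
        scale = sum(abs(r) for r in residual) / len(residual)
        paths.append((scale, signs))
        # The next path only sees what this path failed to capture.
        residual = [r - scale * s for r, s in zip(residual, signs)]
    return paths


def reconstruct(paths):
    # Sum the scaled binary paths back into an approximate weight row.
    n = len(paths[0][1])
    return [sum(scale * signs[i] for scale, signs in paths) for i in range(n)]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


random.seed(0)
w = [random.gauss(0.0, 1.0) for _ in range(256)]  # stand-in weight row

one_path = residual_binarize(w, num_paths=1)
two_paths = residual_binarize(w, num_paths=2)

err1 = squared_error(w, reconstruct(one_path))
err2 = squared_error(w, reconstruct(two_paths))
print(f"1 path error: {err1:.3f}, 2 path error: {err2:.3f}")
```

Two binary paths correspond to roughly 2 bits per weight, and because the second path is fit to the residual of the first, adding it cannot increase the reconstruction error in this sketch.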
Summary generated by Claude — human-verified