arXiv cs.LG

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

Signal: 78
Hype: 18
In three lines
Recover-LoRA extends a data-free accuracy recovery method to 2-bit quantized LLMs. A mixed-precision strategy selectively quantizes MLP gate/up layers to W2 while keeping others at W4, achieving 7.5–23.3% throughput gains. Low-rank adapters trained via logit distillation on synthetic data recover 80–95% accuracy on Qwen3-4B using only 10k samples.
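As a rough illustration of the two ideas in the summary, the sketch below assigns bit-widths per layer and computes a logit-distillation loss for one token. It is not the paper's implementation. The layer-name suffixes (`mlp.gate_proj`, `mlp.up_proj`) are assumed from the common Hugging Face naming convention, and the choice of KL divergence as the distillation objective is also an assumption, since the summary only says "logit distillation".

```python
import math

# Assumed layer-name suffixes (Hugging Face style); the summary only says
# "MLP gate/up layers", so these exact strings are hypothetical.
W2_SUFFIXES = ("mlp.gate_proj", "mlp.up_proj")


def weight_bits(layer_name: str) -> int:
    """Return 2 for MLP gate/up projections and 4 for every other layer."""
    return 2 if layer_name.endswith(W2_SUFFIXES) else 4


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token's vocabulary distribution.

    KL divergence is an assumed objective, not confirmed by the source.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

In this sketch the teacher would be the full-precision model and the student the quantized model with its low-rank adapters attached; the loss is zero when both produce identical logits and positive otherwise, so minimizing it over synthetic prompts pulls the student's output distribution back toward the teacher's.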
Fine-tuning, Benchmarks

Summary generated by Claude — human-verified