arXiv cs.LG · 4 June 2026

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data


In three lines

Recover-LoRA extends a data-free accuracy recovery method to 2-bit quantized LLMs.
A mixed-precision strategy selectively quantizes MLP gate/up layers to W2 while keeping others at W4, achieving 7.5–23.3% throughput gains.
Low-rank adapters trained via logit distillation on synthetic data recover 80–95% accuracy on Qwen3-4B using only 10k samples.
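The three ingredients named above (a per-layer bit-width rule, a low-rank adapter added to a frozen quantized weight, and a logit-distillation loss) can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation. The layer-name suffixes (`mlp.gate_proj`, `mlp.up_proj`), the function names, and the choice of KL divergence as the distillation objective are all assumptions made for illustration; the summary does not specify them.

```python
import math

def bits_for_layer(name: str) -> int:
    """Hypothetical mixed-precision rule: MLP gate/up projections get
    2-bit weights (W2); every other layer stays at 4 bits (W4)."""
    return 2 if name.endswith(("mlp.gate_proj", "mlp.up_proj")) else 4

def matvec(m, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def lora_forward(x, w_q, a, b, scale=1.0):
    """Frozen (dequantized) weight w_q plus a trainable low-rank update:
    y = w_q @ x + scale * b @ (a @ x), with a of shape (r, d_in)
    and b of shape (d_out, r)."""
    base = matvec(w_q, x)
    update = matvec(b, matvec(a, x))
    return [y + scale * u for y, u in zip(base, update)]

def softmax(logits):
    """Numerically stable softmax over one token's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits):
    """KL(teacher || student) for a single token position.
    KL is an assumed objective, not one confirmed by the source."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In a training loop of this kind, only `a` and `b` would receive gradients, while the full-precision teacher supplies `teacher_logits` on the synthetic samples and the quantized student with adapters supplies `student_logits`.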


Fine-tuning Benchmarks

Summary generated by Claude — human-verified
