GAMMA: Global Bit Allocation for Mixed-Precision Models under Arbitrary Budgets
Signal
82
Hype
18
In three linesGAMMA is a quantizer-agnostic mixed-precision framework that automatically allocates bit precision per module without quantization-aware training. Using teacher-forced hidden-state reconstruction and integer programming, it achieves +12.99 Avg. over fixed baselines on Llama/Qwen 8B-32B, matching 3-bit quality at 2.5-bit average.Read source
Your take?
Summary generated by Claude — human-verified