Qift: Shift-Friendly No-Zero W2 Post-Training Quantization for Rotated W2A4/KV4 LLM Inference
Signal: 72
Hype: 18
In three lines
Qift introduces a zero-free level set, {±0.5, ±1.5}, for 2-bit weights in W2A4/KV4 quantization, leveraging Hadamard rotation. It is training-free and codebook-free, and it improves perplexity on LLaMA-2-7B and LLaMA-3.1-8B relative to the standard levels {-2, -1, 0, +1}. In mixed precision, it narrows the gap to W3A4 while keeping half of the transformer layers at 2-bit.
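To make the two level sets concrete, here is a minimal sketch of nearest-level rounding onto each 2-bit grid. It is not the paper's implementation: the helper `quantize_to_levels`, the single per-tensor `scale`, and the use of Gaussian samples as stand-in weights are all assumptions made for illustration, and Hadamard rotation is not modeled. The only fact taken from the summary is the pair of level sets; note that the zero-free set is the standard set shifted by +0.5.

```python
import numpy as np

# Level sets named in the summary. The zero-free set equals the
# standard set shifted by +0.5, so it is symmetric around zero.
STANDARD_LEVELS = [-2.0, -1.0, 0.0, 1.0]
ZERO_FREE_LEVELS = [-1.5, -0.5, 0.5, 1.5]


def quantize_to_levels(w, levels, scale):
    """Round each entry of w to the nearest value in levels * scale.

    Hypothetical helper: per-tensor scale, plain nearest-neighbor rounding.
    Exact ties resolve to the lower level (first minimum found by argmin).
    """
    grid = np.asarray(levels, dtype=np.float64) * scale
    idx = np.abs(np.asarray(w, dtype=np.float64)[..., None] - grid).argmin(axis=-1)
    return grid[idx]


if __name__ == "__main__":
    # Gaussian samples are a stand-in for weights (an assumption).
    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096)

    # Simple grid search over the scale for each level set (also an assumption).
    for name, levels in [("standard", STANDARD_LEVELS), ("zero-free", ZERO_FREE_LEVELS)]:
        scales = np.linspace(0.1, 2.0, 96)
        errors = [np.mean((w - quantize_to_levels(w, levels, s)) ** 2) for s in scales]
        best = int(np.argmin(errors))
        print(f"{name}: best scale={scales[best]:.3f}, mse={errors[best]:.4f}")
```

The printed errors only describe this toy setup with synthetic data; they say nothing about the perplexity results reported for the actual models.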
Summary generated by Claude — human-verified