arXiv cs.LG·3 June 2026

Qift: Shift-Friendly No-Zero W2 Post-Training Quantization for Rotated W2A4/KV4 LLM Inference


In three lines: Qift introduces a zero-free level set, {±0.5, ±1.5}, for W2A4/KV4 quantization, building on Hadamard rotation. It is training-free and codebook-free, and it improves perplexity on LLaMA-2-7B and LLaMA-3.1-8B relative to the standard levels {-2, -1, 0, +1}. In mixed precision, it narrows the gap to W3A4 while keeping half of the transformer layers at two bits.
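To make the two level sets concrete, here is a minimal sketch of nearest-level rounding onto each grid. It is not the paper's implementation: the per-row absmax scale, the function name `quantize_to_levels`, and the array names are assumptions made for illustration, and the Hadamard rotation step is omitted.

```python
import numpy as np

# Level sets named in the summary; everything else here is an illustrative assumption.
NO_ZERO_LEVELS = np.array([-1.5, -0.5, 0.5, 1.5])   # zero-free 2-bit grid
STANDARD_LEVELS = np.array([-2.0, -1.0, 0.0, 1.0])  # standard signed 2-bit grid


def quantize_to_levels(w, levels):
    """Round each entry of a 2-D weight matrix to the nearest scaled level.

    Assumption: one scale per row, chosen so the largest |w| in the row
    maps to the largest |level|. The summary does not specify a scale rule.
    Returns the dequantized weights and the 2-bit level indices.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / np.abs(levels).max()
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero on all-zero rows
    # Distance from every normalized weight to every level: (rows, cols, 4).
    dist = np.abs(w[..., None] / scale[..., None] - levels)
    idx = dist.argmin(axis=-1)
    return levels[idx] * scale, idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 8))
    for name, levels in [("no-zero", NO_ZERO_LEVELS), ("standard", STANDARD_LEVELS)]:
        deq, _ = quantize_to_levels(w, levels)
        print(name, "exact zeros after quantization:", int((deq == 0).sum()))
```

With the zero-free grid every weight keeps a nonzero magnitude after rounding, whereas the standard grid can map small weights to exactly zero. The sketch only illustrates that structural difference and makes no claim about perplexity.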


Summary generated by Claude — human-verified
