Reddit r/LocalLLaMA

OSCAR RotationZoo - Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization

Signal: 78 · Hype: 15
In three lines: OSCAR RotationZoo provides precomputed rotation matrices for INT2 KV-cache quantization. The method achieves roughly 7× KV-cache memory compression with a single-digit accuracy drop on GPQA for dense reasoning models (Qwen3-4B, Qwen3-8B, GLM-4.7). Code and rotations are available on Hugging Face.
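As a rough sanity check on the ~7× figure, the sketch below computes the compression ratio of a 2-bit cache against a 16-bit baseline once per-group quantization metadata is counted. The summary does not state the group size or metadata format, so `group_size`, `meta_bits` (assumed here to be one fp16 scale plus one fp16 zero-point per group), and the function name `kv_compression_ratio` are illustrative assumptions, not details of OSCAR itself.

```python
def kv_compression_ratio(base_bits=16, quant_bits=2, group_size=128, meta_bits=32):
    """Ratio of baseline KV-cache size to quantized KV-cache size.

    meta_bits is the per-group overhead. The default of 32 assumes one
    fp16 scale and one fp16 zero-point per group (an assumption, not a
    value taken from the source).
    """
    # Effective storage per cached value: payload bits plus amortized metadata.
    bits_per_value = quant_bits + meta_bits / group_size
    return base_bits / bits_per_value


if __name__ == "__main__":
    for g in (32, 64, 128):
        print(f"group_size={g}: {kv_compression_ratio(group_size=g):.2f}x")
```

Under these assumptions a group size of 128 gives about 7.11×, while the metadata-free ceiling for 16-bit to 2-bit is exactly 8×. Smaller groups cost more: 64 gives 6.40× and 32 gives 5.33×.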
Benchmarks · Open source · Qwen

Summary generated by Claude — human-verified