Reddit r/LocalLLaMA

I'm still surprised at how good KV quantization has become

Signal: 45 · Hype: 25
In three lines

An r/LocalLLaMA user reports that KV (key-value) cache quantization has reached impressive quality. Even with the KV cache quantized to q4_0 (including for the drafter), the model accurately retrieves information from within a 100k-token context.
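To make "q4_0" concrete, here is a minimal Python sketch of 4-bit block quantization in the style of ggml's q4_0 format: 32 values share one scale, and each value is stored as a 4-bit code. This is a simplified illustration, not the code the poster ran. The function names are hypothetical, and the real format stores the scale as fp16 and packs two codes per byte, which this sketch skips.

```python
import random

BLOCK = 32  # values per block, as in ggml's q4_0 layout

# Storage per block: one fp16 scale (2 bytes) plus 32 four-bit codes (16 bytes).
BYTES_PER_BLOCK = 2 + BLOCK // 2
BITS_PER_VALUE = 8 * BYTES_PER_BLOCK / BLOCK  # 4.5 bits, versus 16 for fp16


def quantize_block(values):
    """Map 32 floats to one scale and 32 integer codes in [0, 15].

    Hypothetical helper for illustration only.
    """
    assert len(values) == BLOCK
    # Take the element with the largest magnitude, keeping its sign,
    # so that it lands exactly on code 0 (the value -8 after the offset).
    extreme = max(values, key=abs)
    scale = extreme / -8.0
    if scale == 0.0:
        # All-zero block: code 8 decodes to exactly 0.
        return 0.0, [8] * BLOCK
    # Shift by 8 to make codes non-negative; +0.5 then truncation rounds.
    codes = [min(15, max(0, int(v / scale + 8.5))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from a scale and 4-bit codes."""
    return [(c - 8) * scale for c in codes]


if __name__ == "__main__":
    random.seed(0)
    block = [random.uniform(-1.0, 1.0) for _ in range(BLOCK)]
    scale, codes = quantize_block(block)
    restored = dequantize_block(scale, codes)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"bits per value: {BITS_PER_VALUE}, worst error: {worst:.4f}")
```

At 4.5 bits per value instead of 16, a cache stored this way takes roughly 28% of the fp16 footprint, which is why accurate retrieval at 100k tokens is a notable result.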
Open source · Infrastructure

Summary generated by Claude — human-verified