Reddit r/LocalLLaMA

Long-context performance at lower quants

Signal: 35
Hype: 15
In three lines: A user reports that Qwen3.5 122B at Q3_K_XL quantization degrades severely beyond 75-80k context tokens, with hallucinations, forgetting, and confusion. They ask whether the issue stems from the Q3 quantization or from the model itself, and seek llama.cpp optimizations.
Qwen · Open source

Summary generated by Claude — human-verified