Reddit r/LocalLLaMA · 4 June 2026

Qwen3.6 27B collapse in performance for agentic coding

In three lines: A user reports a severe prompt-processing slowdown with Qwen3.6 27B (Q4_K_XL quantization) on an RX 7900 XTX under llama.cpp. Throughput drops from 161 tokens/s at 2048 tokens to 20 tokens/s at 12288 tokens. Setup: ctx-size 90000, flash-attn enabled, all layers in VRAM.
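To put the two reported figures side by side, here is a minimal arithmetic sketch. It assumes the numbers in parentheses are prompt lengths and that each rate holds constant over the whole prompt; neither is confirmed by the post. The helper names are hypothetical.

```python
# Figures quoted in the summary: (prompt tokens, prompt-processing tokens/s).
# Assumption: the token counts are prompt lengths, not context depth.
SHORT = (2048, 161.0)
LONG = (12288, 20.0)


def slowdown(fast_tps: float, slow_tps: float) -> float:
    """Return how many times slower the second rate is than the first."""
    return fast_tps / slow_tps


def seconds_to_process(n_tokens: int, tps: float) -> float:
    """Return the time to ingest n_tokens at a constant rate (a simplification)."""
    return n_tokens / tps


length_ratio = LONG[0] / SHORT[0]           # how much longer the prompt is
speed_ratio = slowdown(SHORT[1], LONG[1])   # how much slower processing is

print(f"prompt is {length_ratio:.1f}x longer, processing is {speed_ratio:.2f}x slower")
print(f"short prompt: {seconds_to_process(*SHORT):.1f} s")
print(f"long prompt:  {seconds_to_process(*LONG):.1f} s")
```

Under these assumptions, the rate falls by more than the prompt grows (about 8x slower for a 6x longer prompt), so wall-clock time for the longer prompt rises from roughly 13 seconds to roughly 10 minutes.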

Qwen · Code generation · AI Agents · Infrastructure

Summary generated by Claude — human-verified
