Reddit r/LocalLLaMA·19 May 2026

Qwen3.6:27B VRAM 16GB 5080: MTP Quant, Speeds, and Configs

In three lines

A user shares a Qwen3.6-27B-Q3_K_S configuration for an RTX 5080 with 16GB of VRAM. It achieves 47-61 tokens/s generation and 1095-1426 tokens/s prompt eval. The setup uses Q3_K_S quantization, 64 GPU layers, and MTP speculative decoding with a 0.59-0.80 draft acceptance rate.
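As a rough sanity check on why a 27B model at Q3_K_S can sit on a 16GB card, here is a minimal back-of-the-envelope sketch. It is not taken from the original post. The parameter count is read off the "27B" in the model name, and the bits-per-weight figure assumes every tensor uses llama.cpp's Q3_K block layout (110 bytes per 256 weights); real Q3_K_S files mix tensor types and will be somewhat larger. KV cache, MTP draft state, and compute buffers are not modeled and have to fit in the remaining headroom.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a uniformly quantized model."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumptions (not from the post): nominal parameter count and a uniform Q3_K layout.
N_PARAMS = 27e9                 # "27B" from the model name; exact count unknown
BPW_Q3_K = 110 * 8 / 256        # assumed Q3_K block: 110 bytes per 256 weights
VRAM_GIB = 16.0                 # nominal capacity of the card in the post

weights = weight_gib(N_PARAMS, BPW_Q3_K)
headroom = VRAM_GIB - weights   # left for KV cache, buffers, and anything else on the GPU

print(f"weights ~{weights:.1f} GiB, headroom ~{headroom:.1f} GiB")
```

Under these assumptions the weights take up roughly two thirds of the card, which is consistent with the post's report of offloading all 64 layers to the GPU, though the usable context length depends on how much of the headroom the KV cache consumes.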

Qwen Code generation Fine-tuning Infrastructure

Summary generated by Claude — human-verified
