Reddit r/LocalLLaMA · 14 June 2026

Strange numbers of pp and tg rx7900xtx on ROCm and Vulkan with Qwen3.6-27b nonMTP and MTP

In three lines: A user reports unsatisfactory performance running Qwen 3.6-27B on an RX 7900 XTX via ROCm and Vulkan with llama.cpp. Prompt processing runs at 235–634 tok/s depending on backend, and generation at 13–31 tok/s. With MTP (speculative decoding) at n=3, generation drops to 17 tok/s despite a 78% acceptance rate.
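A high acceptance rate does not guarantee a speedup by itself, because each draft-and-verify pass also costs more than a plain decode step. The sketch below is a rough back-of-the-envelope model, not how llama.cpp computes anything. It assumes each drafted token is accepted independently with probability equal to the reported 78% rate (a simplification; the post does not say how the rate was measured), and `step_cost_ratio` is a hypothetical parameter for how much slower one draft-and-verify pass is than one plain decode step.

```python
def expected_tokens_per_step(accept_prob: float, draft_len: int) -> float:
    """Expected tokens emitted per draft-and-verify pass.

    Assumption: each drafted token is accepted independently with
    probability accept_prob, and acceptance stops at the first rejection.
    Every pass still yields one token from the target model, so the
    expectation is the sum of accept_prob**k for k = 0..draft_len.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be between 0 and 1")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    return sum(accept_prob ** k for k in range(draft_len + 1))


def estimated_speedup(accept_prob: float, draft_len: int,
                      step_cost_ratio: float) -> float:
    """Throughput relative to plain decoding under the same assumption.

    step_cost_ratio (hypothetical) = time of one draft-and-verify pass
    divided by the time of one plain decode step. A result below 1.0
    means speculative decoding is slower than plain decoding.
    """
    if step_cost_ratio <= 0:
        raise ValueError("step_cost_ratio must be positive")
    return expected_tokens_per_step(accept_prob, draft_len) / step_cost_ratio


if __name__ == "__main__":
    tokens = expected_tokens_per_step(0.78, 3)
    print(f"expected tokens per pass: {tokens:.2f}")  # about 2.86
    # Illustrative cost ratios only; none of these were measured in the post.
    for ratio in (1.5, 3.0, 4.0):
        print(f"cost ratio {ratio}: speedup {estimated_speedup(0.78, 3, ratio):.2f}x")
```

Under these assumptions, n=3 at 78% acceptance yields a bit under three tokens per pass, so generation only gets faster if a draft-and-verify pass costs less than roughly 2.9 plain decode steps. A slowdown like the one reported would be consistent with a pass that is more expensive than that on this backend, though the post itself does not establish the cause.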

Qwen, Open source, Benchmarks, Infrastructure

Summary generated by Claude — human-verified
