Reddit r/LocalLLaMA

Strange pp and tg numbers on RX 7900 XTX with ROCm and Vulkan, Qwen3.6-27B non-MTP and MTP

Signal: 35
Hype: 15
In three lines
A user reports disappointing performance running Qwen 3.6-27B on an RX 7900 XTX via ROCm and Vulkan with llama.cpp. Prompt processing (pp) ranges from 235 to 634 tok/s depending on backend, and token generation (tg) from 13 to 31 tok/s. With MTP (speculative decoding) at n=3, generation drops to 17 tok/s despite a 78% acceptance rate.
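To put the MTP figures in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that the summary does not confirm: that the 78% figure is the aggregate ratio of accepted to drafted tokens, and that every draft-and-verify pass drafts exactly n=3 tokens and also emits one token from the main model. The function names are hypothetical and are not part of llama.cpp.

```python
def tokens_per_step(n_draft: int, acceptance: float) -> float:
    """Average tokens emitted per draft-and-verify pass.

    Assumption: each pass drafts exactly n_draft tokens, `acceptance` is the
    aggregate accepted/drafted ratio, and each pass also yields one token
    from the main model itself.
    """
    return 1.0 + n_draft * acceptance


def implied_step_ms(observed_tok_s: float, n_draft: int, acceptance: float) -> float:
    """Wall-clock time per pass implied by the observed throughput."""
    return 1000.0 * tokens_per_step(n_draft, acceptance) / observed_tok_s


def break_even_step_ms(baseline_tok_s: float, n_draft: int, acceptance: float) -> float:
    """Longest pass time at which MTP still matches plain decoding."""
    return 1000.0 * tokens_per_step(n_draft, acceptance) / baseline_tok_s


if __name__ == "__main__":
    n, acc = 3, 0.78  # values quoted in the summary
    print(f"tokens per pass: {tokens_per_step(n, acc):.2f}")  # 3.34
    print(f"implied pass time at 17 tok/s: {implied_step_ms(17.0, n, acc):.0f} ms")  # 196 ms
```

Under these assumptions, each pass yields about 3.34 tokens, so 17 tok/s corresponds to roughly 196 ms per pass. Passing the non-MTP generation rate measured on the same backend to `break_even_step_ms` gives the pass time MTP would need to beat. If the implied pass time exceeds that threshold, the draft-and-verify overhead outweighs the accepted tokens, which would be consistent with the slowdown the post describes.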
Tags: Qwen, Open source, Benchmarks, Infrastructure

Summary generated by Claude — human-verified