Reddit r/LocalLLaMA · 20 May 2026

Qwen 3.6 35B GGUF: NTP vs MTP quantization results across GPUs and CPUs

In three lines: ByteShape has released Qwen 3.6 35B GGUF quantizations in NTP and MTP variants, benchmarked across RTX 4090/5090 GPUs, Intel i7/Ultra 7 and Ryzen 9 CPUs, and a Raspberry Pi 5. Key finding: the largest model stays competitive on both quality and speed. MTP delivers a 20–40% generation speedup on GPU but increases the memory footprint; NTP is recommended for CPU.

Tags: Qwen · Open source · Benchmarks · Infrastructure

Summary generated by Claude — human-verified
