Reddit r/LocalLLaMA · 1 June 2026

unsloth vs bartowski MTP ggufs

In three lines: A comparative benchmark of MTP (Multi-Token Prediction) quantizations from unsloth and bartowski on Qwen 3.5-4B, 3.5-9B, and 3.6-27B. Bartowski uses Q8_0 for the MTP head, which makes its files larger. Tests for Snapdragon cover Q4_0, IQ4_NL, Q4_1, MXFP4_MOE, and Q8_0, limited to a 24 GB VRAM RTX 3090. Unsloth is generally faster in decoding throughput and more VRAM-efficient.

Summary generated by Claude — human-verified
