Floor for local meeting summarization on a 6GB GPU: qwen3.5:0.8b works at 57s, Granite 4 350M hallucinates
Signal: 72
Hype: 15
In three lines
Benchmark of small models for local meeting summarization on a 6GB GPU. Qwen3.5:0.8b produces a structured summary in 57s using 2.2GB of VRAM. Granite 4 350M is faster (0.6-2.8s) but hallucinates: it invents topics and confuses entities.
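A comparison like the one summarized above rests on two measurements: how long one summarization call takes, and whether the summary mentions things that never appear in the transcript. The Python sketch below is one rough way to collect both, and it is not the benchmark's actual harness. `time_summary`, `unsupported_names`, and `stub_summarize` are hypothetical names introduced for illustration. The stub stands in for a real local model call, and VRAM usage is not measured here.

```python
import re
import time
from typing import Callable, Dict, List


def time_summary(summarize: Callable[[str], str], transcript: str) -> Dict[str, object]:
    """Run one summarization call and record its wall-clock latency in seconds."""
    start = time.perf_counter()
    summary = summarize(transcript)
    elapsed = time.perf_counter() - start
    return {"summary": summary, "seconds": elapsed}


def unsupported_names(summary: str, transcript: str) -> List[str]:
    """Return capitalized words in the summary that never occur in the transcript.

    This is a crude proxy for invented entities. It cannot catch confused
    roles or invented topics that are phrased in lowercase.
    """
    known = {w.lower() for w in re.findall(r"[A-Za-z][A-Za-z0-9'-]*", transcript)}
    candidates = re.findall(r"\b[A-Z][A-Za-z0-9'-]*", summary)
    flagged: List[str] = []
    for word in candidates:
        if word.lower() not in known and word not in flagged:
            flagged.append(word)
    return flagged


def stub_summarize(transcript: str) -> str:
    # Hypothetical stand-in for a local model call; replace with a real client.
    return "Alice will ship the report. Zorg approved the budget."


if __name__ == "__main__":
    transcript = "Alice said she will ship the report on Friday. Bob asked about the budget."
    result = time_summary(stub_summarize, transcript)
    print(f"{result['seconds']:.3f}s", unsupported_names(result["summary"], transcript))
```

Passing the summarizer in as a callable keeps the timing code independent of any particular runtime, so the same harness can wrap each model being compared. The name check only flags surface mismatches, so a human read of each summary is still needed to judge hallucination.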
Summary generated by Claude — human-verified