Reddit r/LocalLLaMA·19 May 2026

Floor for local meeting summarization on a 6GB GPU: qwen3.5:0.8b works at 57s, Granite 4 350M hallucinates


In three lines: A benchmark of small models for local meeting summarization on a 6GB GPU. Qwen3.5:0.8b produces a structured summary in 57s using 2.2GB of VRAM. Granite 4 350M is faster (0.6-2.8s) but hallucinates: it invents topics and confuses entities.



Summary generated by Claude — human-verified
