Reddit r/LocalLLaMA·5 June 2026

Benchmark & Reality Check on Gemma 4 12B: Great model, but your local settings are probably breaking it (Fix inside)

Signal

Hype

In three linesBenchmark of Gemma 4 12B on Python bug-hunting task. Model finds 6 bugs vs 14 for Qwen 35B. Default LM Studio settings disable reasoning. Fix: enable enable_thinking in Jinja template, set thought tokens (<|channel>thought / <channel|>), use temperature 1.0, top_p 0.95, top_k 64.

Read source

Your take?

Gemini Benchmarks Reasoning Code generation Tools

Summary generated by Claude — human-verified

Benchmark & Reality Check on Gemma 4 12B: Great model, but your local settings are probably breaking it (Fix inside)

Other angles on this story