Reddit r/LocalLLaMA

Gemma4 26b a4b Apex quant is quite good

Signal: 65
Hype: 25
In three lines: A user tests APEX quantization of Gemma 4 26B on an AMD RX 9060 XT GPU. Running llama.cpp with the Vulkan backend, the setup reaches 38 tokens/sec at 90k context with no quality degradation reported. The APEX-I-Compact model (15 GB) outperforms the user's previous Q5 quant (21.2 GB), which began looping at 50k context.
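
To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers from the summary (15 GB, 21.2 GB, 38 tokens/sec); the variable names are illustrative, and treating throughput as constant over a whole generation is an assumption, not something the post states.

```python
# Figures quoted in the summary; everything derived below is simple arithmetic.
apex_gb = 15.0          # APEX-I-Compact size
q5_gb = 21.2            # previous Q5 quant size
tokens_per_sec = 38.0   # reported speed at 90k context

saved_gb = q5_gb - apex_gb
saved_pct = 100 * saved_gb / q5_gb

# Assumption: throughput stays constant while generating.
secs_per_1k_tokens = 1000 / tokens_per_sec

print(f"APEX is {saved_gb:.1f} GB smaller ({saved_pct:.1f}% reduction)")
print(f"About {secs_per_1k_tokens:.1f} s per 1,000 generated tokens")
```

On these numbers the APEX file is a bit under a third smaller than the Q5 quant, which may help explain why a longer context fits on the same card.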

Summary generated by Claude — human-verified