Reddit r/LocalLLaMA

Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop

Signal: 72
Hype: 25
In three lines: CPU inference at 10.33 tokens/s on Qwen 3.5 35B, quantized to Q4_K_M, on a $300 Lenovo IdeaPad Slim 3i (i3-1215U, 8 GB RAM). The setup uses llama.cpp with BIOS optimizations, core pinning, MTP speculative decoding, and Q8_0 K/V cache quantization.
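To give a rough sense of why Q8_0 K/V cache quantization matters on an 8 GB machine, here is a minimal sketch of the arithmetic. It relies on the ggml Q8_0 block layout (32 int8 values plus one fp16 scale, 34 bytes per block) and assumes a plain attention stack. The layer count, KV head count, head dimension, and context length below are hypothetical placeholders, not the actual Qwen configuration or the poster's settings.

```python
# Q8_0 block layout in ggml: 32 values stored as 32 int8 quants + one fp16 scale.
Q8_0_BLOCK_VALUES = 32
Q8_0_BLOCK_BYTES = 32 * 1 + 2          # 34 bytes per block of 32 values
F16_BYTES_PER_VALUE = 2.0
Q8_0_BYTES_PER_VALUE = Q8_0_BLOCK_BYTES / Q8_0_BLOCK_VALUES  # 1.0625


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value):
    """Approximate bytes for the K and V caches combined at full context."""
    # Factor 2 covers the separate K and V tensors in every layer.
    n_values = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return n_values * bytes_per_value


# Hypothetical dimensions, chosen only for illustration (NOT Qwen's real config).
dims = dict(n_layers=32, n_kv_heads=4, head_dim=128, n_ctx=8192)

f16_bytes = kv_cache_bytes(**dims, bytes_per_value=F16_BYTES_PER_VALUE)
q8_0_bytes = kv_cache_bytes(**dims, bytes_per_value=Q8_0_BYTES_PER_VALUE)

MIB = 1024 ** 2
print(f"f16  K/V cache: {f16_bytes / MIB:.0f} MiB")
print(f"q8_0 K/V cache: {q8_0_bytes / MIB:.0f} MiB")
print(f"ratio q8_0 / f16: {q8_0_bytes / f16_bytes:.5f}")
```

Whatever the real dimensions are, the ratio depends only on the block format: Q8_0 stores the cache in about 53% of the space an f16 cache would need.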
Qwen · Code generation · Open source · Infrastructure

Summary generated by Claude — human-verified