Quantized Reasoning Models Think They Need to Think Longer, but They Do Not
Signal: 78
Hype: 15
In three lines: Post-training quantization (PTQ) reduces reasoning-model accuracy and increases chain-of-thought (CoT) length. In 52% of failures, the model reaches the correct answer at an intermediate step but does not output it as the final answer. A training-free logit penalty on overthinking markers ("wait", "but", "alternatively") reduces CoT length by 12-23% while preserving accuracy across 5 models (1.5B-32B parameters) and 5 benchmarks.
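The logit penalty can be illustrated with a minimal sketch. The summary does not give the paper's exact formulation, so everything here is an assumption for illustration: the function name `penalize_markers`, the toy vocabulary and its token ids, and the penalty value of 1.0 are all hypothetical. The idea shown is only the general one: subtract a constant from the logits of the marker tokens before the next token is chosen, with no retraining.

```python
from typing import List, Sequence


def penalize_markers(
    logits: Sequence[float], marker_ids: Sequence[int], penalty: float
) -> List[float]:
    """Return a copy of `logits` with `penalty` subtracted at each marker id.

    Hypothetical helper: the source does not specify how the penalty is applied.
    """
    out = list(logits)
    for tid in set(marker_ids):  # de-duplicate so each id is penalized once
        if 0 <= tid < len(out):  # ignore ids outside the vocabulary
            out[tid] -= penalty
    return out


# Toy vocabulary with made-up token ids (assumption, not a real tokenizer).
vocab = {"the": 0, "wait": 1, "so": 2, "but": 3, "alternatively": 4}
marker_ids = [vocab[w] for w in ("wait", "but", "alternatively")]

# Made-up next-token logits in which "wait" is the greedy choice.
logits = [1.0, 2.5, 2.0, 0.5, 0.25]
adjusted = penalize_markers(logits, marker_ids, penalty=1.0)

best_before = max(range(len(logits)), key=logits.__getitem__)
best_after = max(range(len(adjusted)), key=adjusted.__getitem__)
print(best_before, best_after)
```

In this toy case the greedy choice moves from "wait" to "so" once the penalty is applied. In a real decoding loop the same adjustment would run at every generation step, and the penalty strength would need tuning per model.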
Summary generated by Claude — human-verified