Newer Qwen models are worse at summarization?
Signal: 35
Hype: 25
In three lines
A user reports that Qwen 3 (30B) outperforms newer models on summarization tasks scored by an LLM judge, with Gemma 4 ranking second. The user suggests that recent Qwen versions may be optimized for agentic tasks rather than synthesis.
Summary generated by Claude — human-verified