Edition of 2026-05-24

Claude Code discovers a reasoning algorithm for $40 — cuts compute 70% vs. standard self-consistency

The strongest signal today comes from UMD/Google/Meta: by letting Claude Code run freely via AutoTTS, researchers obtained a reasoning control algorithm that no one would likely have designed by hand. The result is a 70% reduction in compute versus standard self-consistency, with accuracy preserved, at a total experiment cost of $40 in 160 minutes. This isn't another benchmark — it's a demonstration that coding agents can now produce non-trivial research contributions on prototyping budgets. The immediate follow-up question: how many "suboptimal" algorithms in the current literature would survive a systematic AutoTTS audit?
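For readers who want the baseline in concrete terms: standard self-consistency samples several independent reasoning paths and keeps the majority answer, so its compute grows linearly with the number of samples. The sketch below shows only that baseline, not the algorithm found through AutoTTS, which this digest does not describe. The `sample_answer` callable and the `self_consistency` helper are hypothetical names used for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, Tuple


def self_consistency(
    sample_answer: Callable[[], Hashable], n_samples: int = 8
) -> Tuple[Hashable, float]:
    """Baseline self-consistency: draw n_samples answers, return the majority.

    sample_answer is a hypothetical stand-in for one full model call that
    returns a final answer. Returns (majority_answer, vote_share).
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes = Counter(sample_answer() for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count / n_samples


# Deterministic demo: a canned sequence replaces real model samples.
canned = iter(["42", "41", "42", "42", "17"])
best, share = self_consistency(lambda: next(canned), n_samples=5)
print(best, share)
```

Every sample here costs one full generation, which is the budget a smarter controller would try to cut while keeping the same final answer.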

Meanwhile, an independent benchmark across 30 long PDFs (171 questions, MMLongBench-Doc) challenges assumptions about vision vs. OCR for document RAG. Claude Sonnet 4.5 in native vision mode tops out at 52% accuracy for $0.2552/query, making it both more expensive and less accurate than LlamaCloud premium + OCR, which hits 59.6% at $0.1885/query. The intrinsic failure rate of vision (7% vs. 0% for OCR after retry) is the number to remember: on production pipelines with SLAs, that gap cannot be absorbed. Vision LLMs remain brittle on charts and tables, precisely the elements that concentrate value in financial, regulatory, or technical documents.
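One way to put the two pipelines on a single scale is cost per correct answer, i.e. cost per query divided by accuracy. This is a rough sketch and the metric is an assumption of this example, not something the benchmark reports. It uses only the figures quoted above and ignores retries and failed requests. The `cost_per_correct` helper is a hypothetical name.

```python
def cost_per_correct(cost_per_query: float, accuracy: float) -> float:
    """Expected spend per correct answer, assuming independent queries."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be in (0, 1]")
    return cost_per_query / accuracy


# (cost per query in USD, accuracy) as quoted in the paragraph above.
pipelines = {
    "Claude Sonnet 4.5 (native vision)": (0.2552, 0.52),
    "LlamaCloud premium + OCR": (0.1885, 0.596),
}

for name, (cost, acc) in pipelines.items():
    print(f"{name}: ${cost_per_correct(cost, acc):.4f} per correct answer")
```

Under these assumptions, the vision pipeline costs roughly 1.55 times as much per correct answer as the OCR pipeline, before accounting for its higher failure rate.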

On the local tooling side, llampart 1.0.0 ships under MIT as a standalone frontend for llama-server (llama.cpp), with MCP integration, 6-language support, and documented Caddy deployment. In the same vein, a web GUI for TradingAgents (Apache 2.0) adds Ollama to the multi-agent stock analysis stack with ~50% token reduction in concise mode. Both releases confirm a structural trend: the local ecosystem is professionalizing around llama.cpp as the reference runtime, with increasingly complete UI layers that narrow the experience gap with cloud APIs.

Claude Code discovers a reasoning algorithm for $40 — cuts compute 70% vs. standard self-consistency · Signal IA