arXiv cs.CL·19 May 2026

Confidence Geometry Reveals Trace-Level Correctness in Large Language Model Reasoning

In three lines

Token-level confidence trajectories in LLMs encode geometric signals linked to reasoning trace correctness. Without access to text or hidden states, low-dimensional representations separate correct from incorrect traces on GSM8K, MATH, and MMLU. NeuralConf, a lightweight estimator, improves confidence-weighted answer aggregation over majority voting.
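
To make the contrast between majority voting and confidence-weighted aggregation concrete, here is a minimal sketch. It is not the paper's NeuralConf estimator: the per-trace confidence scores are made-up placeholder values, and the function names `majority_vote` and `confidence_weighted_vote` are hypothetical. It only illustrates the general idea that summing a confidence score per answer can overturn a plain vote count.

```python
from collections import defaultdict


def majority_vote(answers):
    """Return the answer that appears most often among the sampled traces."""
    counts = defaultdict(int)
    for answer in answers:
        counts[answer] += 1
    return max(counts, key=counts.get)


def confidence_weighted_vote(answers, confidences):
    """Return the answer with the largest summed confidence.

    `confidences` holds one score per trace. How those scores are estimated
    is left open here; the values used below are illustrative assumptions.
    """
    if len(answers) != len(confidences):
        raise ValueError("answers and confidences must have the same length")
    scores = defaultdict(float)
    for answer, confidence in zip(answers, confidences):
        scores[answer] += confidence
    return max(scores, key=scores.get)


# Five hypothetical reasoning traces for one question.
answers = ["17", "17", "17", "23", "23"]
confidences = [0.2, 0.3, 0.25, 0.9, 0.85]  # placeholder scores, not real data

plain = majority_vote(answers)
weighted = confidence_weighted_vote(answers, confidences)
```

In this toy case three low-confidence traces agree on "17" while two high-confidence traces agree on "23", so the two rules pick different answers. Whether the weighted choice is actually better depends entirely on how well the confidence scores track correctness.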

Reasoning Evals Papers

Summary generated by Claude — human-verified
