Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution
Signal: 78
Hype: 15
In three lines: Stepwise Confidence Attribution (SCA) diagnoses multi-step reasoning failures in closed-source LLMs by assigning step-level confidence from generated traces alone. Two methods: NIBS (non-parametric) and GIBS (graph-based). On mathematical reasoning and multi-hop QA, SCA reliably identifies error-prone steps and improves self-correction success by up to 13.5%.
Summary generated by Claude — human-verified