Early Stopping Chain-of-thoughts in Large Language Models
ES-CoT detects answer convergence during chain-of-thought (CoT) generation and stops inference early once the answer has converged. The method reduces inference tokens by 16.08% on average across six reasoning benchmarks while maintaining accuracy comparable to standard CoT.
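The idea of stopping on answer convergence can be illustrated with a minimal sketch. It assumes that an intermediate answer is available after each reasoning step and that convergence means the same answer repeating for a fixed number of consecutive steps. The function name `early_stop`, the `min_run` threshold, and the way step answers are obtained are hypothetical choices for illustration; the summary above does not specify the actual stopping rule used by ES-CoT.

```python
from typing import Iterable, Optional, Tuple


def early_stop(
    step_answers: Iterable[Optional[str]],
    min_run: int = 3,  # hypothetical threshold, not taken from the source
) -> Tuple[Optional[str], int]:
    """Consume per-step answers and stop once one answer repeats min_run times in a row.

    Returns (answer, steps_consumed). If no answer converges, the last
    answer seen and the total number of steps are returned.
    """
    current: Optional[str] = None  # answer of the current run
    run = 0                        # length of the current run of identical answers
    steps = 0                      # number of reasoning steps consumed so far

    for answer in step_answers:
        steps += 1
        if answer is not None and answer == current:
            run += 1
        else:
            # A new (or missing) answer starts a fresh run.
            current = answer
            run = 1 if answer is not None else 0
        if run >= min_run:
            return current, steps  # converged: stop generating further steps

    return current, steps


if __name__ == "__main__":
    # Assumed toy trace: the answer settles on "18" from the third step onward.
    trace = ["12", "15", "18", "18", "18", "18", "18"]
    print(early_stop(trace, min_run=3))  # ('18', 5)
```

In this toy trace, generation would stop after 5 of 7 steps. In a real decoding loop the same check would run between reasoning steps, so the remaining steps are never generated.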