arXiv cs.CL · 19 May 2026

Monitoring the Internal Monologue: Probe Trajectories Reveal Reasoning Dynamics

In three lines: An investigation of the internal representations of large reasoning models (LRMs) through probe trajectories. The authors show that the continuous evolution of a concept's probability during reasoning predicts final behavior better than static predictions. Max-pooling achieves 95% AUROC across 4 datasets (safety, mathematics).
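
To make the max-pooling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each reasoning trace has already been reduced to a sequence of per-step probe probabilities for one concept, collapses that sequence to its peak value, and scores the result with AUROC. The trajectories and labels are hand-written toy values, the function names are hypothetical, and using the last step as the "static" baseline is an assumption made purely for illustration.

```python
import numpy as np


def max_pool_score(trajectory):
    """Collapse a per-step probe probability trajectory to its peak value."""
    return float(np.max(trajectory))


def final_step_score(trajectory):
    """Single-step readout (assumed stand-in for a static prediction)."""
    return float(trajectory[-1])


def auroc(scores, labels):
    """AUROC as P(positive score > negative score); ties count as 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Compare every positive against every negative via broadcasting.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


# Toy trajectories (illustrative values only, not data from the paper).
trajectories = [
    np.array([0.10, 0.80, 0.30, 0.20]),  # label 1: concept spikes mid-reasoning
    np.array([0.20, 0.30, 0.90, 0.10]),  # label 1: spike, then decay
    np.array([0.10, 0.20, 0.20, 0.30]),  # label 0: stays low throughout
    np.array([0.30, 0.10, 0.20, 0.25]),  # label 0: stays low throughout
]
labels = [1, 1, 0, 0]

max_scores = [max_pool_score(t) for t in trajectories]
last_scores = [final_step_score(t) for t in trajectories]

auroc_max = auroc(max_scores, labels)
auroc_last = auroc(last_scores, labels)
print(f"max-pool AUROC: {auroc_max:.2f}, last-step AUROC: {auroc_last:.2f}")
```

In this contrived example the concept signal appears mid-trace and fades before the end, so the peak separates the two classes while the final step does not. Real trajectories will be noisier, and the gap between the two readouts will depend on the probe and dataset.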

Reasoning · AI safety · Evals

Summary generated by Claude — human-verified
