arXiv cs.AI · 2 June 2026

Hidden Thoughts Are Not Secret: Reasoning Trace Exposure in LLMs

In three lines: Researchers demonstrate that hidden reasoning traces in LLMs can be extracted via Reasoning Exposure Prompting (REP), a lightweight prompting method that uses shadow-model-generated demonstrations in auxiliary code-like formats. REP exposes internal traces even when deployed systems intentionally hide them, while preserving useful reasoning signals for distillation.

Tags: Reasoning, Prompt engineering, Fine-tuning, AI safety

Summary generated by Claude — human-verified
