PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts
Signal: 82
Hype: 15
In three lines: PARALLAX reveals that 4 of 6 major hallucination detection benchmarks embed the ground-truth answer in the prompt, which lets a naive baseline (TxTemb) achieve near-perfect detection without access to model internals. In an evaluation of 22 methods across 12 open-source models, most fail under controlled conditions; the exceptions are SAPLMA and DRIFT (supervised probes on upper-layer hidden states).
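To make the leakage claim concrete, here is a minimal sketch of one way to check whether a benchmark item's prompt already contains its ground-truth answer. It is not PARALLAX's procedure and not the TxTemb baseline; the helper names (`normalize`, `answer_in_prompt`, `leakage_rate`) and the toy items are hypothetical, and plain substring matching is only a rough stand-in for whatever the paper actually measures.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumeric characters into one space.

    Assumption: ASCII-only matching is good enough for this illustration.
    """
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def answer_in_prompt(prompt: str, answer: str) -> bool:
    """Return True if the normalized answer occurs as a whole-token span in the prompt."""
    padded_prompt = f" {normalize(prompt)} "
    norm_answer = normalize(answer)
    # An empty answer never counts as leaked.
    return bool(norm_answer) and f" {norm_answer} " in padded_prompt


def leakage_rate(items) -> float:
    """Fraction of (prompt, answer) pairs whose prompt contains the answer."""
    items = list(items)
    if not items:
        return 0.0
    return sum(answer_in_prompt(p, a) for p, a in items) / len(items)


# Toy, made-up items for illustration only (not drawn from any real benchmark).
toy_items = [
    ("Context: Paris is the capital of France. Q: What is the capital of France?", "Paris"),
    ("Q: What is the capital of France?", "Paris"),
    ("Passage: Water boils at 100 C at sea level. Q: At what temperature does water boil?", "100 C"),
    ("Q: Who wrote Hamlet?", "William Shakespeare"),
]

rate = leakage_rate(toy_items)
print(f"leakage rate on toy items: {rate:.2f}")
```

A high rate on a real benchmark would suggest that a detector can score well from the prompt text alone, which is the kind of artifact the summary describes. A low rate would not rule out subtler leakage, such as paraphrased answers.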
Summary generated by Claude — human-verified