RealityTest: How People Probe AI Identity and Whether Models Disclose It
Signal: 78
Hype: 25
In three lines
RealityTest evaluates whether AI systems disclose their identity when asked. It is a multimodal, multilingual benchmark built from 3,152 identity-probing queries collected from ~750 participants across 49 countries and 5 languages, in both text and speech. Findings: only 31% ask directly, and a single suppression instruction reduces disclosure below 30% even in the best-performing models.
Summary generated by Claude — human-verified