AERIC: Anticipatory Hidden-State Monitoring for Implicit Harmful Dialogue
Signal: 78
Hype: 15
In three lines
AERIC is a lightweight safety monitor (387 parameters) that detects implicit harmful dialogue by reading the model's hidden states during decoding, with no additional forward passes.
On DiaSafety and Harmful Advice, it improves AUROC from 0.683 to 0.714 and from 0.822 to 0.858, respectively.
Deployment adds only 2.34% latency, versus 79.40% for Qwen3Guard-Stream-4B.
Summary generated by Claude — human-verified