arXiv cs.AI

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

Signal: 78
Hype: 25
In three lines
MemAudit is a post-hoc auditing framework for detecting poisoned memories in LLM agents. It combines a counterfactual influence score with a memory consistency graph to identify malicious records injected through normal interactions. Evaluated against the MINJA attack, it reduces attack success rates from 70% to 0% on QA tasks and from 83.3% to 0% on reasoning tasks.
AI Agents, AI safety, RAG, Papers

Summary generated by Claude — human-verified