arXiv cs.AI · 25 May 2026

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

In three lines: MemAudit is a post-hoc auditing framework for detecting poisoned memories in LLM agents. It combines a counterfactual influence score with a memory consistency graph to identify malicious records injected through normal interactions. Evaluated against the MINJA attack, it reduces attack success rates from 70% to 0% on QA tasks and from 83.3% to 0% on reasoning tasks.
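
The summary does not say how either signal is computed, so the following is only a rough sketch of the general idea, not the paper's method. It assumes leave-one-out ablation as the counterfactual influence score, token-level Jaccard similarity as the edge weight of the consistency graph, and a simple product to combine the two. The names `toy_agent`, `influence_scores`, and `isolation_scores`, the 0.5 threshold, and the sample records are all hypothetical.

```python
from typing import Callable, List

Agent = Callable[[str, List[str]], str]


def toy_agent(query: str, memory: List[str]) -> str:
    """Hypothetical stand-in for an LLM agent: return the last word of the
    memory record that shares the most tokens with the query."""
    q = set(query.split())
    best = max(memory, key=lambda r: len(q & set(r.split())))
    return best.split()[-1]


def influence_scores(agent: Agent, query: str, memory: List[str]) -> List[float]:
    """Assumed counterfactual score: 1.0 if removing a record changes the
    agent's answer to the query, otherwise 0.0 (leave-one-out ablation)."""
    baseline = agent(query, memory)
    scores = []
    for i in range(len(memory)):
        ablated = memory[:i] + memory[i + 1:]
        scores.append(0.0 if agent(query, ablated) == baseline else 1.0)
    return scores


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity, used here as an assumed edge weight."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)


def isolation_scores(memory: List[str]) -> List[float]:
    """Assumed structural score: 1 minus a record's mean similarity to all
    other records, so weakly connected nodes score high."""
    n = len(memory)
    out = []
    for i in range(n):
        sims = [jaccard(memory[i], memory[j]) for j in range(n) if j != i]
        out.append(1.0 - sum(sims) / len(sims))
    return out


# Made-up memory store: three benign records and one injected record.
memory = [
    "patient 17 takes drug aspirin",
    "patient 18 takes drug aspirin",
    "patient 19 takes drug aspirin",
    "which drug patient 17 takes now warfarin",
]
query = "which drug patient 17 takes"

influence = influence_scores(toy_agent, query, memory)
isolation = isolation_scores(memory)
combined = [a * b for a, b in zip(influence, isolation)]
suspects = [i for i, s in enumerate(combined) if s > 0.5]
print(suspects)
```

In this toy setup only the injected record both changes the answer when removed and sits far from the other records in the graph, so it is the only one flagged. A real audit would need a far more robust influence estimate and similarity measure than these placeholders.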

Tags: AI Agents, AI safety, RAG, Papers

Summary generated by Claude — human-verified
