arXiv cs.AI

When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

Signal: 72
Hype: 18
In three lines: This study examines adversarial attacks on self-play reinforcement learning in which an attacker selectively removes legal actions from the victim's available set. Across poker games (6 to 5,531 states) and two non-poker domains, learned masking causes more damage than random masking. The attack persists across Q-learning, PPO, NFSP, and DQN, with no recovery observed under extended masked training.
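The summary does not describe the paper's implementation, so the following is only a minimal sketch of the general idea of action removal. It compares a random masking baseline against a simple value-based attacker. The function names, the removal budget, the rule that at least one action is always left, and the toy values are all assumptions made for illustration; the greedy attacker is a stand-in for a learned masking policy, not the paper's method.

```python
import random

def mask_actions(legal_actions, removed):
    """Return the victim's legal actions after the attacker removes `removed`."""
    remaining = [a for a in legal_actions if a not in removed]
    # Assumption: the attacker may never leave the victim without a move.
    return remaining if remaining else list(legal_actions)

def random_attacker(legal_actions, budget, rng):
    """Baseline: remove up to `budget` actions uniformly at random."""
    k = min(budget, len(legal_actions) - 1)
    return set(rng.sample(list(legal_actions), k)) if k > 0 else set()

def greedy_value_attacker(legal_actions, budget, values):
    """Stand-in for a learned attacker: remove the victim's highest-valued actions."""
    k = min(budget, len(legal_actions) - 1)
    ranked = sorted(legal_actions, key=lambda a: values[a], reverse=True)
    return set(ranked[:k])

def best_value(actions, values):
    """Value the victim obtains by acting greedily over the remaining actions."""
    return max(values[a] for a in actions)

# Hypothetical action values for a single poker-like decision.
legal = ["fold", "call", "raise"]
values = {"fold": -1.0, "call": 0.2, "raise": 0.5}

targeted = mask_actions(legal, greedy_value_attacker(legal, 1, values))
randomly = mask_actions(legal, random_attacker(legal, 1, random.Random(0)))

print("targeted:", targeted, best_value(targeted, values))
print("random:  ", randomly, best_value(randomly, values))
```

In this toy setting the targeted attacker always removes the victim's best action, whereas the random baseline does so only some of the time, which mirrors the qualitative gap between learned and random masking described above.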
Reinforcement learning, AI safety, Benchmarks

Summary generated by Claude — human-verified