Imperfect World Models are Exploitable
A formal study of how imperfect world models can be exploited in reinforcement learning (RL). The authors define exploitation as a divergence between policy preferences in the model and in the true environment. They prove that exploitation is essentially unavoidable on large policy sets and establish a theoretical bridge to reward hacking.
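To make the definition concrete, here is a minimal Python sketch of one possible reading of it: a pair of policies counts as exploitable when the world model strictly prefers the first while the true environment strictly prefers the second. This reading, the toy three-state MDP, and the names `policy_value` and `exploitable_pairs` are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed for this toy example)

def policy_value(P, R, policy, start=0, iters=500):
    """Discounted return of a deterministic policy from `start`.

    P[s][a] is a list of next-state probabilities and R[s][a] is the
    immediate reward. Uses plain iterative policy evaluation.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            R[s][policy[s]]
            + GAMMA * sum(p * v[t] for t, p in enumerate(P[s][policy[s]]))
            for s in range(n)
        ]
    return v[start]

def exploitable_pairs(P_true, P_model, R, tol=1e-9):
    """Return pairs (pi_a, pi_b) that the model ranks pi_a > pi_b
    while the true environment ranks pi_a < pi_b (assumed definition)."""
    n_states, n_actions = len(R), len(R[0])
    policies = list(product(range(n_actions), repeat=n_states))
    found = []
    for pi_a, pi_b in product(policies, repeat=2):
        model_gap = policy_value(P_model, R, pi_a) - policy_value(P_model, R, pi_b)
        true_gap = policy_value(P_true, R, pi_a) - policy_value(P_true, R, pi_b)
        if model_gap > tol and true_gap < -tol:
            found.append((pi_a, pi_b))
    return found

# Toy MDP: state 0 = start, state 1 = goal (absorbing), state 2 = trap (absorbing).
# Action 0 in state 0 is "safe" (reward 1, stay); action 1 is "risky" (reward 0, jump).
R = [[1.0, 0.0], [2.0, 2.0], [0.0, 0.0]]
absorb_goal = [0.0, 1.0, 0.0]
absorb_trap = [0.0, 0.0, 1.0]
stay = [1.0, 0.0, 0.0]

# The true environment sends the risky action mostly to the trap;
# the imperfect model believes it mostly reaches the goal.
P_true = [[stay, [0.0, 0.1, 0.9]], [absorb_goal] * 2, [absorb_trap] * 2]
P_model = [[stay, [0.0, 0.9, 0.1]], [absorb_goal] * 2, [absorb_trap] * 2]

pairs = exploitable_pairs(P_true, P_model, R)
print(len(pairs), "exploitable policy pairs, e.g.", pairs[0])
```

In this toy case the model overestimates the risky action, so every policy that takes it in the start state is ranked above every safe policy by the model and below it by the true environment. Enumerating all deterministic policies only works for tiny MDPs; it is used here purely to show what a preference reversal looks like.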