Reinforcement learning with prediction-based rewards
OpenAI introduces Random Network Distillation (RND), a prediction-based method that encourages reinforcement learning agents to explore their environments through curiosity. For the first time, an agent trained with RND exceeds average human performance on Montezuma's Revenge.
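One way to picture a prediction-based curiosity bonus, as a rough sketch rather than OpenAI's implementation: a predictor is trained to match the output of a fixed, randomly initialized target network, and its prediction error on an observation serves as the exploration reward. The version below assumes tiny linear networks in plain Python. The class name `RNDBonus`, the feature size, the learning rate, and the toy observations are illustrative choices, not details from the announcement.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (list of rows) for a linear map. Hypothetical helper."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, obs):
    """Multiply the weight matrix by the observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


class RNDBonus:
    """Toy curiosity bonus: predictor error against a frozen random target."""

    def __init__(self, obs_dim, feat_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, feat_dim, rng)     # never updated
        self.predictor = make_linear(obs_dim, feat_dim, rng)  # trained by SGD
        self.lr = lr  # assumed step size, chosen for this toy example

    def reward(self, obs):
        """Intrinsic reward: squared error between predictor and target features."""
        t = apply_linear(self.target, obs)
        p = apply_linear(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        """One gradient step on the squared error for this observation."""
        t = apply_linear(self.target, obs)
        p = apply_linear(self.predictor, obs)
        for row, pi, ti in zip(self.predictor, p, t):
            err = pi - ti
            for j, x in enumerate(obs):
                row[j] -= self.lr * 2.0 * err * x


bonus = RNDBonus(obs_dim=3)
familiar = [1.0, 0.0, 0.0]  # observation the agent keeps revisiting
novel = [0.0, 1.0, 0.0]     # observation the agent has never seen

familiar_before = bonus.reward(familiar)
novel_before = bonus.reward(novel)

for _ in range(50):
    bonus.update(familiar)

familiar_after = bonus.reward(familiar)
novel_after = bonus.reward(novel)
```

In this toy setup the bonus for the repeatedly visited observation shrinks toward zero, while the bonus for the unvisited one stays where it started. That gap is the signal that would push an agent toward unfamiliar states.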