OpenAI Blog

Reinforcement learning with prediction-based rewards

Signal: 82
Hype: 25
In three lines: OpenAI introduces Random Network Distillation (RND), a prediction-based reinforcement learning method that encourages exploration through curiosity. RND exceeds average human performance on Montezuma's Revenge for the first time.
OpenAI, Reinforcement learning, Reasoning, Benchmarks

Summary generated by Claude — human-verified