OpenAI Blog

Evolved Policy Gradients

Signal: 72, Hype: 35
In three lines: OpenAI releases Evolved Policy Gradients (EPG), a metalearning approach that evolves the loss function of learning agents. Agents trained with EPG generalize to novel tasks unseen during training, such as navigating to an object placed on a different side of the room than it was during training.
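To make "evolving the loss function" concrete, here is a minimal toy sketch in Python. It is not OpenAI's implementation. It assumes a one-dimensional "agent" (a single number `theta`), a hand-picked two-parameter loss family, and a simple evolution-strategies estimator for the outer loop. The names `inner_loop`, `fitness`, `evolve_loss`, and `phi` are hypothetical and chosen only for illustration.

```python
import random


def inner_loop(phi, goal, steps=10, lr=0.1):
    """Train a 1-D agent by gradient descent on a learned loss.

    Assumed toy loss family (not the EPG loss network):
        L_phi(theta) = phi[0] * (theta - goal)**2 + phi[1] * theta
    Returns the true task return, -(theta - goal)**2, after training.
    """
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * phi[0] * (theta - goal) + phi[1]  # dL_phi / dtheta
        theta -= lr * grad
    return -(theta - goal) ** 2


def fitness(phi, goals):
    """Average true return of agents trained with loss parameters phi."""
    return sum(inner_loop(phi, g) for g in goals) / len(goals)


def evolve_loss(generations=100, pop=20, sigma=0.1, alpha=0.05, seed=0):
    """Outer loop: evolve phi with antithetic evolution strategies."""
    rng = random.Random(seed)
    phi = [0.0, 0.0]  # start from a loss that teaches the agent nothing
    for _ in range(generations):
        # Sample a fresh batch of tasks (goal positions) each generation.
        goals = [rng.uniform(-1.0, 1.0) for _ in range(8)]
        step = [0.0 for _ in phi]
        for _ in range(pop):
            eps = [rng.gauss(0.0, 1.0) for _ in phi]
            plus = fitness([p + sigma * e for p, e in zip(phi, eps)], goals)
            minus = fitness([p - sigma * e for p, e in zip(phi, eps)], goals)
            for i, e in enumerate(eps):
                step[i] += (plus - minus) * e / (2.0 * sigma * pop)
        # Move phi in the direction that improved the trained agents' return.
        phi = [p + alpha * s for p, s in zip(phi, step)]
    return phi


if __name__ == "__main__":
    evolved = evolve_loss()
    held_out = [-0.8, -0.3, 0.4, 0.9]  # goals not used as a training batch
    print("initial loss params:", fitness([0.0, 0.0], held_out))
    print("evolved loss params:", fitness(evolved, held_out))
```

The design point the sketch tries to capture is the two-level structure: the inner loop never sees the true return and only follows the gradient of the learned loss, while the outer loop scores each candidate loss by how well agents trained with it actually perform.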
OpenAI, Reinforcement learning, Reasoning, Papers

Summary generated by Claude — human-verified