OpenAI Blog·18 April 2018

Evolved Policy Gradients

In three lines: OpenAI releases Evolved Policy Gradients (EPG), a metalearning approach that evolves the loss function of learning agents. EPG-trained agents generalize to tasks unseen during training, such as navigating to an object placed on a different side of the room than it was during training.

Tags: OpenAI, Reinforcement learning, Reasoning, Papers

Summary generated by Claude — human-verified
