Evolved Policy Gradients
OpenAI releases Evolved Policy Gradients (EPG), a metalearning approach that evolves the loss function of learning agents. Agents trained with EPG generalize to tasks unseen during training, such as navigating to objects placed on different sides of a room.
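
To make the idea of "evolving a loss function" concrete, here is a minimal toy sketch; it is not the EPG implementation. It assumes a scalar "policy" `theta`, a hypothetical two-coefficient loss `phi[0]*(theta - goal)**2 + phi[1]*theta**2`, and a basic evolution-strategies outer loop. All names (`inner_loop`, `mean_return`, `evolve_loss`) and hyperparameters are illustrative assumptions. The inner loop trains the agent by gradient descent on the learned loss; the outer loop perturbs the loss coefficients and moves them toward perturbations whose trained agents earn higher return.

```python
import random


def inner_loop(phi, goal, steps=20, lr=0.1):
    """Train a scalar policy `theta` by gradient descent on the learned loss.

    Assumed toy loss: L_phi(theta) = phi[0]*(theta - goal)**2 + phi[1]*theta**2.
    Returns the final return (negative squared distance to the goal).
    """
    theta = 0.0
    for _ in range(steps):
        # Analytic gradient of the toy loss with respect to theta.
        grad = 2.0 * phi[0] * (theta - goal) + 2.0 * phi[1] * theta
        theta -= lr * grad
    return -(theta - goal) ** 2


def mean_return(phi, goals):
    """Average final return over a batch of tasks (goals)."""
    return sum(inner_loop(phi, g) for g in goals) / len(goals)


def evolve_loss(generations=200, population=20, sigma=0.1,
                outer_lr=0.05, seed=0):
    """Outer loop: evolution strategies over the loss coefficients `phi`."""
    rng = random.Random(seed)
    phi = [0.0, 0.0]  # start from a loss that provides no learning signal
    for _ in range(generations):
        # Sample a fresh batch of tasks, shared by all candidates.
        goals = [rng.uniform(-1.0, 1.0) for _ in range(8)]
        noises, scores = [], []
        for _ in range(population):
            eps = [rng.gauss(0.0, 1.0) for _ in phi]
            candidate = [p + sigma * e for p, e in zip(phi, eps)]
            noises.append(eps)
            scores.append(mean_return(candidate, goals))
        baseline = sum(scores) / len(scores)
        # Move phi toward perturbations that scored above the baseline.
        for i in range(len(phi)):
            step = sum((s - baseline) * e[i]
                       for s, e in zip(scores, noises))
            phi[i] += outer_lr * step / (population * sigma)
    return phi


if __name__ == "__main__":
    evolved = evolve_loss()
    held_out = [-0.9, -0.3, 0.4, 0.8]
    print("evolved loss coefficients:", evolved)
    print("return with initial loss:", mean_return([0.0, 0.0], held_out))
    print("return with evolved loss:", mean_return(evolved, held_out))
```

In this toy the loss is handed the goal directly, which is a simplification made for brevity. The point of the sketch is only the two-level structure: the agent never optimizes return itself, yet the evolved loss makes ordinary gradient descent reach goals it was not evaluated on during evolution.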