July 2017

5 articles

Better exploration with parameter noise

OpenAI shows that adding adaptive noise to reinforcement learning algorithm parameters frequently boosts performance. This exploration method is simple to implement and rarely decreases performance.

Reinforcement learning · OpenAI

OpenAI Blog·Jul 20

Proximal Policy Optimization

OpenAI releases PPO (Proximal Policy Optimization), a class of reinforcement learning algorithms simpler to implement and tune than existing approaches, with comparable or superior performance. PPO has become OpenAI's default RL algorithm.

OpenAI · Reinforcement learning

OpenAI Blog·Jul 17

Robust adversarial inputs

OpenAI created adversarial images that reliably fool neural network classifiers across varied scales and perspectives. This challenges a recent claim that self-driving cars would be hard to trick maliciously due to their multi-angle image capture.

AI safety · Vision

OpenAI Blog·Jul 5

Hindsight Experience Replay

OpenAI publishes Hindsight Experience Replay (HER), a reinforcement learning method that lets agents learn from failed episodes by retroactively treating the outcome they actually reached as the intended goal. The technique substantially improves sample efficiency on tasks with sparse rewards.

Reinforcement learning · OpenAI · Papers

OpenAI Blog·Jul 1

Teacher–student curriculum learning

OpenAI introduces teacher–student curriculum learning, in which a teacher automatically chooses which subtasks a student model trains on, favoring those where the student is improving fastest. Adapting task difficulty to the student's current skill level makes learning more efficient.

OpenAI · Reinforcement learning
