April 2017

4 articles

Equivalence between policy gradients and soft Q-learning

OpenAI shows that policy gradient methods and soft Q-learning are mathematically equivalent in entropy-regularized reinforcement learning. This theoretical result unifies two major RL approaches and makes it possible to combine their respective advantages.

Reinforcement learning Papers

OpenAI Blog·Apr 10

Stochastic Neural Networks for hierarchical reinforcement learning

OpenAI publishes research on stochastic neural networks for hierarchical reinforcement learning. The method improves agents' ability to decompose complex tasks into sub-objectives.

OpenAI Reinforcement learning Papers

OpenAI Blog·Apr 6

Unsupervised sentiment neuron

OpenAI developed an unsupervised system that learns an excellent sentiment representation by being trained only to predict the next character in Amazon reviews.

OpenAI Reasoning Embeddings

OpenAI Blog·Apr 1

Spam detection in the physical world

OpenAI has developed a spam-detection AI system trained entirely in simulation and deployed on a physical robot. It is the first application of its kind capable of operating in the real world.

OpenAI Robotics Reinforcement learning
