Today OpenAI, a non-profit artificial intelligence research company, launched OpenAI Gym, a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Go.
OpenAI researcher John Schulman shared some details about his organization, and how OpenAI Gym will make it easier for AI researchers to design, iterate and improve their next generation applications. John studied physics at Caltech, and went to UC Berkeley for graduate school. There, after a brief stint in neuroscience, he studied machine learning and robotics under Pieter Abbeel, eventually honing in on reinforcement learning as his primary topic of interest. John lives in Berkeley, California, where he enjoys running in the hills and occasionally going to the gym.
What is OpenAI? What is your mission?
OpenAI is a non-profit artificial intelligence research company. Day to day, we are working on research projects in unsupervised learning and reinforcement learning. Our mission and long-term goal is to advance artificial intelligence in the ways that will maximally benefit humanity as a whole.
What is reinforcement learning? How is it different from supervised and unsupervised learning?
Reinforcement learning (RL) is the branch of machine learning that is concerned with making sequences of decisions. It assumes that there is an agent that is situated in an environment. At each step, the agent takes an action, and it receives an observation and reward from the environment. An RL algorithm seeks to maximize the agent’s total reward, given a previously unknown environment, through a learning process that usually involves lots of trial and error.
The reinforcement learning problem sketched above, involving a reward-maximizing agent, is extremely general, and RL algorithms have been applied in a variety of different fields. They have been applied in business management problems such as deciding how much inventory a store should hold or how it should set prices. They have also been applied to robotic control problems, and rapid development is currently occurring in this area. The following video shows Hopper: a two-dimensional one-legged robot trained to hop forward as fast as possible with OpenAI Gym.