Joelle Pineau
Facebook & McGill University

Peter Bartlett
Berkeley AI Research Lab, UC Berkeley 

Jennifer Listgarten
Center for Computational Biology & Berkeley AI Research, UC Berkeley 

Sergey Levine
Berkeley AI Research Lab, UC Berkeley and Google

Title: Five Myths About Deep Reinforcement Learning

Deep learning enables machines to perceive the world, interpret speech and text, recognize activities in video, and perform dozens of other open-world recognition tasks. However, artificial intelligence systems must be able not only to perceive and recognize, but also to act. To bring the power of deep learning into decision making and control, we must combine it with reinforcement learning or optimal control, which provide a mathematical framework for decision making. This combination, often termed deep reinforcement learning in the literature, has been applied to tasks from robotic control to game playing, but questions remain about its applicability to real-world problems. A major strength of deep learning is its scalability and practicality: the same algorithm can be trained to recognize MNIST digits or to recognize one of a thousand different object categories in ImageNet, on real images, with minimal human effort. Can the same kind of real-world impact be attained with deep reinforcement learning? The usual concerns surround efficiency (will it ever be practical to train agents with deep RL in the real world?) and generalization (doesn't deep RL overfit catastrophically to the task at hand?). Some have proposed that model-based algorithms, which first learn to predict the future and then use these predictions to act, can mitigate some of these concerns, but these methods give rise to additional questions: can model-based RL ever attain the performance of model-free methods? Can model-based RL algorithms scale to high-dimensional (e.g., raw image) observations? And do model-based RL methods even have anything in common with model-free methods, or will we have to abandon everything we know about the latter and start over?
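The model-based recipe mentioned above, learn to predict the future, then use those predictions to act, can be sketched in a few lines. The following is a minimal illustrative example, not any specific algorithm from the talk: it fits a linear dynamics model to a toy 1-D point mass and then plans with random-shooting model-predictive control. All names, the environment, and the cost function are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D point mass: state = [position, velocity], action = force.
# True dynamics (unknown to the agent): pos' = pos + vel, vel' = vel + a.
def true_step(state, action):
    pos, vel = state
    return np.array([pos + vel, vel + action])

# 1) "Learn to predict the future": fit a dynamics model s' ~ [s, a] @ W
#    by least squares on randomly collected interaction data.
states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(200):
    a = rng.uniform(-1, 1)
    s2 = true_step(s, a)
    states.append(s); actions.append([a]); next_states.append(s2)
    s = s2 if np.abs(s2).max() < 10 else np.zeros(2)  # reset if diverging

X = np.hstack([states, actions])           # (N, 3): state and action
Y = np.array(next_states)                  # (N, 2): next state
W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # learned model parameters

def model_step(state, action):
    return np.concatenate([state, [action]]) @ W

# 2) "Use these predictions to act": random-shooting MPC. Sample action
#    sequences, roll each out in the learned model, score it with a known
#    cost (drive the mass to the origin), and execute the first action of
#    the best sequence.
def plan(state, horizon=5, n_candidates=256):
    best_a, best_cost = 0.0, np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1, 1, size=horizon)
        s, cost = state.copy(), 0.0
        for a in seq:
            s = model_step(s, a)
            cost += s[0] ** 2 + 0.1 * s[1] ** 2
        if cost < best_cost:
            best_a, best_cost = seq[0], cost
    return best_a

# Control the real system, replanning at every step.
s = np.array([3.0, 0.0])
for _ in range(30):
    s = true_step(s, plan(s))
print("final state:", s)
```

Because the toy dynamics are linear and noise-free, the least-squares fit recovers them essentially exactly; the point of the sketch is the division of labor, a predictive model plus a planner, which is exactly what distinguishes model-based from model-free RL.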
In this talk, I will address these questions. I will show not only that model-free RL algorithms can already scale to complex real-world settings and achieve excellent generalization, but also that model-based RL algorithms can achieve comparable performance and scale to high-dimensional (e.g., raw image) observations. I will conclude with a discussion of how model-free and model-based reinforcement learning algorithms may have more in common than we might at first think.