University of Massachusetts Amherst

Nan Jiang - New Results in Statistical Reinforcement Learning

DS Seminar
February 28, 4:00pm
Computer Science Building, Room 150/151

A reception will be held at 3:40pm in the atrium outside the presentation room.


Nan Jiang

University of Michigan

New Results in Statistical Reinforcement Learning




Recently, reinforcement learning (RL) has achieved inspiring success in game playing domains, including human-level control in Atari games and mastering the game of Go. Looking into the future, we expect to build machine learning systems that use RL to turn predictions into actions; applications include robotics, dialog systems, online education, and adaptive medical treatment, to name but a few. 

In this talk, Nan will show how theoretical insights from supervised learning can help us understand RL and better appreciate the unique challenges that arise from multi-stage decision making. The first part of the talk will focus on an interesting phenomenon: a short planning horizon can produce better policies when data is limited. He will explain it by making a formal analogy to empirical risk minimization, and argue that a short planning horizon helps avoid overfitting. The second part of the talk concerns a core algorithmic challenge in state-of-the-art RL: sample-efficient exploration in large state spaces. He will introduce a new complexity measure, the Bellman rank, which makes it possible to apply a unified algorithm to a number of important RL settings, in some cases obtaining polynomial sample complexity for the first time.