A reception will be held at 3:40 P.M. in the atrium outside the presentation room.
University of Michigan
New Results in Statistical Reinforcement Learning
Recently, reinforcement learning (RL) has achieved inspiring success in game-playing domains, including human-level control in Atari games and mastering the game of Go. Looking to the future, we expect to build machine learning systems that use RL to turn predictions into actions; applications include robotics, dialog systems, online education, and adaptive medical treatment, to name but a few.
In this talk, Nan will show how theoretical insights from supervised learning can help us understand RL and better appreciate the unique challenges that arise from multi-stage decision making. The first part of the talk will focus on an interesting phenomenon: a short planning horizon can produce better policies when data is limited. He will explain it by making a formal analogy to empirical risk minimization and argue that a short planning horizon helps avoid overfitting. The second part of the talk concerns a core algorithmic challenge in state-of-the-art RL: sample-efficient exploration in large state spaces. He will introduce a new complexity measure, the Bellman rank, which allows a unified algorithm to be applied to a number of important RL settings, in some cases obtaining polynomial sample complexity for the first time.