Can Deep Reinforcement Learning from pixels be made as efficient as from state? - ICRA Keynote
Learning from visual observations is a fundamental yet challenging problem in reinforcement learning. Although algorithmic advances combined with convolutional neural networks have proved to be a recipe for success, it has been widely accepted that learning from pixels is not as efficient as learning with direct access to the underlying state. In this talk I will describe our recent work that (almost entirely) bridges the gap in sample complexity between learning from pixels and learning from state, as empirically validated on the DeepMind Control Suite and Atari games. I will present two new approaches that establish this new state of the art: Reinforcement Learning with Augmented Data (RAD) and Contrastive Unsupervised Representations for Reinforcement Learning (CURL). At the core of both is data augmentation through random crops. Our approaches outperform prior pixel-based methods, both model-based and model-free, on complex tasks in the DeepMind Control Suite and Atari games, showing 1.9x and 1.6x performance gains at the 100K environment-step and 100K interaction-step benchmarks, respectively.
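To make the random-crop idea concrete, here is a minimal sketch of cropping a batch of image observations at independently sampled offsets. It is an illustration only, not the RAD or CURL implementation: the function name `random_crop`, the `(B, C, H, W)` array layout, and the 100-to-84 pixel sizes are all assumptions chosen for the example.

```python
import numpy as np


def random_crop(images, out_size, rng):
    """Crop each image in a batch at an independently sampled location.

    images:   array of shape (B, C, H, W) (layout is an assumption)
    out_size: side length of the square crop, assumed <= H and <= W
    rng:      a numpy.random.Generator
    """
    b, c, h, w = images.shape
    # Sample one top-left corner per image so each sample gets its own crop.
    tops = rng.integers(0, h - out_size + 1, size=b)
    lefts = rng.integers(0, w - out_size + 1, size=b)
    out = np.empty((b, c, out_size, out_size), dtype=images.dtype)
    for i, (t, l) in enumerate(zip(tops, lefts)):
        out[i] = images[i, :, t:t + out_size, l:l + out_size]
    return out


rng = np.random.default_rng(0)
# Hypothetical batch of 4 RGB observations rendered at 100x100 pixels.
obs = rng.integers(0, 256, size=(4, 3, 100, 100), dtype=np.uint8)
cropped = random_crop(obs, 84, rng)
print(cropped.shape)
```

In a training loop, a crop like this would typically be applied to each batch sampled from the replay buffer before it is passed to the encoder, so the agent sees a slightly shifted view of the same observation on every update.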
Professor Pieter Abbeel is Director of the Berkeley Robot Learning Lab and Co-Director of the Berkeley Artificial Intelligence Research (BAIR) Lab. Abbeel’s research strives to build ever more intelligent systems. To that end, his lab pushes the frontiers of deep reinforcement learning, deep imitation learning, deep unsupervised learning, transfer learning, meta-learning, and learning to learn, and studies the influence of AI on society. His lab also investigates how AI could advance other science and engineering disciplines. Abbeel's Intro to AI class has been taken by over 100K students through edX, and his Deep RL and Deep Unsupervised Learning materials are standard references for AI researchers. Abbeel has founded three companies: Gradescope (AI to help teachers with grading homework and exams), Covariant (AI for robotic automation of warehouses and factories), and Berkeley Open Arms (low-cost, highly capable 7-DoF robot arms). He advises many AI and robotics start-ups and is a frequently sought-after speaker worldwide for C-suite sessions on AI future and strategy. Abbeel has received many awards and honors, including the PECASE, NSF-CAREER, ONR-YIP, DARPA-YFA, and TR35. His work is frequently featured in the press, including the New York Times, Wall Street Journal, BBC, Rolling Stone, Wired, and Tech Review.