Object Representations for Learning and Reasoning
Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS)
December 11, 2020, Virtual Workshop
Discrete Predictive Representation for Long-horizon Planning
- Thanard Kurutach, Julia Peng, Yang Gao, Stuart Russell, and Pieter Abbeel
Discrete representations have been key in enabling robots to plan at more abstract levels and solve temporally-extended tasks more efficiently for decades. However, they typically require expert specifications. On the other hand, deep reinforcement learning aims to learn to solve tasks end-to-end, but struggles with long-horizon tasks. In this work, we propose Discrete Object-factorized Representation Planning (DORP), which learns temporally-abstracted discrete representations from exploratory video data in an unsupervised fashion via a mutual information maximization objective. DORP plans a sequence of abstract states for a low-level model-predictive controller to follow. In our experiments, we show that DORP robustly solves unseen long-horizon tasks. Interestingly, it discovers independent representations per object and binary properties such as a key-and-door.