Object Representations for Learning and Reasoning

Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS)

December 11, 2020, Virtual Workshop

Join via the livestream 🎥 · RocketChat · @ORLR_Workshop · #ORLR2020 · Join our community Slack!

A Symmetric and Object-Centric World Model for Stochastic Environments

  • Patrick Emami, Pan He, Anand Rangarajan, and Sanjay Ranka
  • (Oral)
  • PDF


Object-centric world models learn useful representations for planning and control but have so far only been applied to synthetic and deterministic environments. We introduce a perceptual-grouping-based world model for the dual task of extracting object-centric representations and modeling stochastic dynamics in visually complex and noisy video environments. The world model is built upon a novel latent state space model that learns the variance for object discovery and dynamics separately. This design is motivated by the disparity in available information that exists between the discovery component, which takes a provided video frame and decomposes it into objects, and the dynamics component, which predicts representations for future video frames conditioned only on past frames. To learn the dynamics variance, we introduce a best-of-many-rollouts objective. We show that the world model successfully learns accurate and diverse rollouts in a real-world robotic manipulation environment with noisy actions while learning interpretable object-centric representations.