Object Representations for Learning and Reasoning
Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS)
December 11, 2020, Virtual Workshop
Unsupervising Vision via Object-Centric World Models
- Sungjin Ahn
- (Invited Talk)
Objects and their interactions are a foundational structure of the world and play a central role in how we perceive, reason about, and control it. Incorporating such structural knowledge is thus expected to resolve various limitations of current deep learning systems in reasoning, causality, modularity, and systematic generalization. However, current deep learning systems are limited in providing such structure: they either rely extensively on human annotations or use unsupervised representations with minimal, uninterpretable structure. In this talk, I present recent advances in object-centric latent variable models. I first argue that this class of models provides a probabilistic modeling framework for learning interpretable, structured, and adaptable representations, as well as compositional imagination of multi-object scenes, in an unsupervised manner. I also present the benefits of combining symbolic and distributed representations in these models, and an approach to learning three-dimensional scene representations in an object-centric manner. I conclude that human-like AI agents should understand the causal structure of the world, and that object-centric representations can be the foundation for building such world models.