- Using world models for decision making: The ability to predict what will happen in the future implies some understanding of the world itself, and I believe this capability is necessary for making complex decisions in a complex environment. How can we build models that learn such world models from data, and then use them for decision making?
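A toy sketch of the idea, with all names and dynamics purely illustrative: a learned one-step predictor can be rolled out to score candidate actions before committing to one.

```python
import random

# Toy stand-in for a learned world model: predicts the next position
# given the current position and an action, with some predictive noise.
def world_model(pos, action):
    return pos + action + random.gauss(0, 0.1)

def reward(pos, goal=5.0):
    return -abs(pos - goal)  # closer to the goal is better

def plan(pos, actions=(-1, 0, 1), rollouts=100):
    """Pick the action whose predicted outcomes score best on average."""
    def score(a):
        return sum(reward(world_model(pos, a)) for _ in range(rollouts)) / rollouts
    return max(actions, key=score)

print(plan(0.0))  # picks 1: move toward the goal at 5
```

The decision never touches the real environment; it is made entirely inside the model's imagined futures, which is the core of model-based decision making.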
- Modeling uncertainty about the future: Instead of drawing a single sample of what the future could look like and conditioning generation on it, can generative models explicitly represent the inherent uncertainty about the future? A simple example: I think my opponent will move left with 75% probability and right with 25% probability, but not forward or backward. Compared to conditioning only on the most likely sample (moving left), the entire distribution carries more information that can help me decide what to do next; that is, it can help construct a better world model.
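The opponent example can be made concrete. With a hypothetical payoff table (the numbers below are invented for illustration), the best response to the single most likely move can differ from the best response to the full distribution:

```python
# Opponent moves left with p=0.75, right with p=0.25.
opponent = {"left": 0.75, "right": 0.25}

# Hypothetical payoffs for two candidate actions against each opponent move.
payoff = {
    "block_left":  {"left": 2.0, "right": -4.0},  # great if they go left, costly otherwise
    "stay_center": {"left": 1.0, "right": 1.0},   # safe either way
}

# Best response to only the most likely sample ("left"):
mode = max(opponent, key=opponent.get)
vs_mode = max(payoff, key=lambda a: payoff[a][mode])

# Best response to the whole distribution (maximize expected payoff):
def expected(a):
    return sum(p * payoff[a][move] for move, p in opponent.items())
vs_distribution = max(payoff, key=expected)

print(vs_mode)          # block_left: wins against the mode alone
print(vs_distribution)  # stay_center: wins in expectation (1.0 > 0.5)
```

Keeping the full distribution flips the decision: the 25% chance of a large loss makes the "safe" action better in expectation, information that a single most-likely sample throws away.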
- Memory-conditioned generation: Today's RL models can work quite well in a static environment where they are required to perform only one task with a clear goal. Generative models for tasks like video generation suffer from inconsistencies in their time-series outputs. I believe both are symptoms of the models' lack of memory. The latent space needs to hold semantically useful data relevant to the task at hand in order to condition the generation of actions or other output modalities. Specifically, how can a model learn a representation space optimized for storing only what is necessary to remember, and learn to fill its own memory?
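One standard building block for this, shown here only as a minimal sketch (the vectors and stored strings are made up), is a soft key-value read over an external memory: generation at each step is conditioned on a query-weighted readout of what was stored earlier.

```python
import math

# External memory: (key vector, stored content) pairs. In a real model the
# keys, values, and query would all be learned representations.
memory = [
    ([1.0, 0.0], "door was locked"),
    ([0.0, 1.0], "key is on table"),
]

def read(query):
    """Softmax attention over memory keys; returns weights over the slots."""
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in memory]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

weights = read([0.9, 0.1])  # query resembling the first key
# The slot whose key matches the query dominates the readout, so generation
# is conditioned mostly on "door was locked".
```

The open research question in the bullet above is upstream of this mechanism: what representation should the keys and values live in, and how does the model decide what to write in the first place.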
- Compositional generative modeling: Humans understand the world in terms of its compositional factors and can recombine them to solve problems they have never encountered before. Given a raw dataset, how can these factors be discovered and then composed to achieve out-of-distribution generalization?
- Other topics: Emergent hierarchy, visual representations, subjective reality in ML models, evolution, history, geopolitics.