Broadly speaking, I’m interested in understanding the computations that enable intelligence. In my view, the most useful conceptual framework we have for understanding intelligence is reinforcement learning (RL). RL offers a precise way of thinking about the interactions between agents (systems with agency) and their environments. Both aspects, agent and environment, are crucial: any intelligent system needs at least enough agency that it can make decisions about how to act in its environment. Without agency, there’s no way to measure intelligence, and without an environment, there’s no way for the agent’s actions to have any effect.
An agent is intelligent to the extent that it selects actions that alter the dynamics of its environment in useful ways. That is to say, the agent’s actions make good outcomes more likely and bad outcomes less likely. But in order to know which outcomes are good and which are bad, the agent needs some sort of utility or reward function. This utility function encodes the agent’s preferences over outcomes: it can be anything from a fine-grained positive/negative reinforcement signal to an abstract goal that the agent needs to achieve. The more intelligent an agent is, the more effectively it can steer its environment toward outcomes that achieve its goals and yield positive utility or reward.
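To make this framing concrete, here is a minimal sketch of the agent–environment loop in Python. Everything in it is a hypothetical illustration rather than any particular library’s API: the `Environment`, its target value, the reward function, and the placeholder `Agent` policy are all assumptions made for the example.

```python
# A toy agent-environment loop: the agent acts, the environment's state changes,
# and a reward signal scores the outcome. Purely illustrative.

import random


class Environment:
    """A hypothetical environment: a counter the agent tries to move toward a target."""

    def __init__(self, target=5):
        self.target = target
        self.state = 0

    def step(self, action):
        # The agent's action alters the environment's dynamics (here, the counter).
        self.state += action
        # The reward function encodes which outcomes are good: closer to the target is better.
        reward = -abs(self.target - self.state)
        return self.state, reward


class Agent:
    """A placeholder agent whose policy is simply to act at random."""

    def act(self, state):
        return random.choice([-1, +1])


env, agent = Environment(), Agent()
state = 0
for _ in range(10):
    action = agent.act(state)
    state, reward = env.step(action)
    print(f"state={state}, reward={reward}")
```

A more intelligent agent would replace the random policy with one that uses the reward signal to choose actions that drive the state toward the target.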
Exactly how such a utility function should be constructed is an important area of research. The problem of value alignment focuses on how to properly specify utility functions so that machines reliably do the things we want them to do. No matter how intelligent an AI system is, if its goals aren’t specified properly, it won’t be especially useful.
Even if we assume the utility function is specified perfectly, an agent still needs to learn how its actions influence the environment. This learning problem leads to a number of other important research areas:
- Exploration - how and when to investigate actions that the agent hasn’t tried before. An agent can be more intelligent if it has explored more of what it can do (one classic exploration strategy is sketched after this list).
- Modeling - developing an internal description of how the environment works. An agent that can reliably predict the consequences of its actions can be more intelligent, since it can use its predictions to plan ahead.
- Abstraction - ignoring irrelevant information and focusing only on what is important for the task at hand (namely, influencing the environment toward positive outcomes). Good abstractions can make the task easier, and poor abstractions can make it harder or even impossible.
- Generalization - how to apply past knowledge and experience in new situations. Agents that can generalize better can learn more quickly, since they need less experience.
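As a concrete illustration of the exploration item above, here is a minimal sketch of epsilon-greedy action selection combined with tabular Q-learning on a small chain of states. The environment, the state and action spaces, and the hyperparameters (EPSILON, ALPHA, GAMMA) are all assumptions made for the example, and epsilon-greedy is just one classic exploration strategy among many.

```python
# Epsilon-greedy exploration with tabular Q-learning on a toy chain environment.
# The environment and all hyperparameters are illustrative assumptions.

import random
from collections import defaultdict

N_STATES = 6          # states 0..5; reaching state 5 ends the episode with reward +1
ACTIONS = [-1, +1]    # move left or right along the chain
EPSILON = 0.1         # probability of exploring a random action
ALPHA = 0.5           # learning rate
GAMMA = 0.9           # discount factor

Q = defaultdict(float)  # Q[(state, action)] -> estimated value of taking action in state


def choose_action(state):
    # Exploration: with small probability, try an action regardless of current estimates.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    # Exploitation: otherwise pick an action with the highest estimate (ties broken randomly).
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])


for episode in range(200):
    state = 0
    for _ in range(100):  # cap episode length so the sketch always terminates
        action = choose_action(state)
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: move the estimate toward reward plus discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state
        if state == N_STATES - 1:
            break

# After training, the greedy policy should point right (+1) at every non-terminal state.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})
```

Even in this tiny example the interaction between the problems listed above is visible: without the epsilon term, the agent can settle on its initial value estimates and never discover the rewarding state at the end of the chain.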
These are some of the core problems in artificial intelligence. To develop intelligent agents, we’ll need to understand each of these areas individually, and we’ll also need to understand how they interact with each other. I’m especially interested in the latter aspect. For example, how can we use abstraction to enable better modeling? How can we combine abstraction with exploration? How can we use generalization to make our agents adapt more quickly when we decide to change their utility functions?
These are the sorts of questions that I spend a lot of my time thinking about. My hope is that in searching for answers to these questions, we will learn how to make our AI systems more intelligent and better aligned with our values. The long-term goal of my research is to create general-purpose intelligent systems that can help us understand and solve our most pressing problems.