3 - Multi-Agent Systems at JP Morgan with Manuela Veloso
Hi, I’m Monte Zweben, CEO of Splice Machine. You’re listening to ML Minutes, where we solve big problems in little time. Every two weeks, we invite a different thought leader to talk about a problem they’re solving with Machine Learning, with an exciting twist: our guest has only one minute to answer each question. Let’s get started!
This episode, our guest is the venerated Dr. Manuela Veloso. Manuela joined J.P. Morgan in July 2018 to create and head their Artificial Intelligence Research Lab. Manuela is currently on leave from Carnegie Mellon University, where she is the prestigious Herb Simon University Professor in the School of Computer Science. Manuela founded the CORAL research laboratory, for the study of autonomous agents that Collaborate, Observe, Reason, Act, and Learn. Her research includes major works on autonomous robots, Artificial Intelligence, and Machine Learning. Outside of work, Manuela loves to travel with her family!
1:16 Manuela Veloso
Thank you, Monte, for that very kind introduction. I'm very pleased to be here.
Manuela, both your work in robotics at Carnegie Mellon and your work in finance at JP Morgan involve the use of multi-agent systems. So I'd love to hear an overview, for our listeners, of what a multi-agent system is.
So a multi-agent system is exactly what the word says. It's multiple agents that try to do a task together. An agent is an AI device with perception, cognition, and action, capable of processing data, making decisions, and executing. The challenge in multi-agent systems is how these multiple agents combine their interests and combine their tasks in order to perform better than a single agent.
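As a rough illustration of the perception, cognition, and action loop described here, the following is a minimal Python sketch. The `Agent` class, the one-dimensional field, and the "closest teammate chases the ball" rule are all assumptions made for illustration; none of them come from the conversation.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent on a one-dimensional field."""
    name: str
    position: int

    def perceive(self, world):
        # Perception: observe the ball and every teammate's position.
        return {"ball": world["ball"], "team": dict(world["team"])}

    def decide(self, obs):
        # Cognition: only the teammate closest to the ball chases it
        # (ties broken by name); everyone else holds position.
        team, ball = obs["team"], obs["ball"]
        chaser = min(team, key=lambda n: (abs(team[n] - ball), n))
        if chaser != self.name or self.position == ball:
            return 0
        return 1 if ball > self.position else -1

    def act(self, step):
        # Action: execute the decision by moving.
        self.position += step


def run(agents, ball, ticks):
    for _ in range(ticks):
        world = {"ball": ball, "team": {a.name: a.position for a in agents}}
        # All agents decide on the same snapshot, then act.
        steps = [a.decide(a.perceive(world)) for a in agents]
        for agent, step in zip(agents, steps):
            agent.act(step)
    return {a.name: a.position for a in agents}


print(run([Agent("left", 0), Agent("right", 9)], ball=6, ticks=5))
# {'left': 0, 'right': 6}
```

Because each agent reasons about its teammates' positions as well as its own, only one of them moves, which is the simplest form of combining tasks rather than duplicating them.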
So in a multi-agent system, you're not just getting a single agent to observe its surroundings, create a plan of action given its different choices, and then actually execute an action to put it into a new state. In a multi-agent system, you’re reasoning about how they'll all interact with each other. Why is this important? What applications do you see in robotics and finance for multi-agent systems?
Let me explain. I actually don't think that anything is single-agent. For example, we started robot soccer, which was kind of the first challenge to bring multiple decision-makers together to play a game of scoring goals in the presence of an opponent. Each one of the agents would need to define its position, decide whether to pass to this teammate or to the other, and eventually build coordination as a team in the presence of the opponent. As an example of a multi-agent system in finance, you can imagine that any environment where multiple agents need to trade, exchange goods, or negotiate strategies becomes a problem of putting together, again, multiple interests toward a common goal of eventually trading.
Fantastic. So if we're looking at multi-agent systems that collaborate to achieve a common goal, like in the financial applications you talked about in trading, what tools, perhaps even machine learning tools, do you use to study this problem?
So the real world in some sense doesn't enable you to see what drives every single agent: every single trader, every single interest. So in simulations, we have these multiple agents, with their policies going from some state, through some reasoning, to some actions they take. Basically, we try to bring to the simulations what's happening in the real world. Now, again, as we don't know what drives the other agents, this becomes a learning problem, in the sense that we would love to figure out what type of policy drives the other agents when we execute our own actions. So for that, we have applied reinforcement learning, specifically multi-agent reinforcement learning techniques, to this problem of simulating markets.
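To make the idea of learning agents in a simulated market a little more concrete, here is a toy Python sketch. It uses independent, stateless Q-learning for one buyer and one seller, which is a deliberately simple stand-in and not the technique used at JP Morgan. The price levels, the buyer's valuation, the seller's cost, and the midpoint clearing rule are all assumptions.

```python
import random

PRICES = [1, 2, 3]   # hypothetical discrete price levels
VALUE, COST = 4, 0   # assumed buyer valuation and seller cost


def trade_rewards(bid, ask):
    # A trade happens only if the bid meets the ask; it clears at the midpoint.
    if bid < ask:
        return 0.0, 0.0
    price = (bid + ask) / 2
    return VALUE - price, price - COST


def choose(q, epsilon, rng):
    # Epsilon-greedy: explore at random sometimes, otherwise pick the best-known action.
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q_buyer = [0.0] * len(PRICES)
    q_seller = [0.0] * len(PRICES)
    for _ in range(episodes):
        b = choose(q_buyer, epsilon, rng)
        s = choose(q_seller, epsilon, rng)
        r_buyer, r_seller = trade_rewards(PRICES[b], PRICES[s])
        # Each agent updates only its own estimate; the other agent's
        # changing behavior is simply part of its environment.
        q_buyer[b] += alpha * (r_buyer - q_buyer[b])
        q_seller[s] += alpha * (r_seller - q_seller[s])
    return q_buyer, q_seller


q_buyer, q_seller = train()
print("buyer estimates:", q_buyer)
print("seller estimates:", q_seller)
```

Neither agent is told the other's policy. Each one only sees the reward that results from its own action, so whatever it learns about the other side is learned indirectly through those outcomes.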
Excellent. So reinforcement learning is one tool you've used to try to learn perhaps what the adversarial behavior might be, so that an agent can take that into consideration in their own planning. Perhaps you can tell us one really significant challenge that came up in some of this research.
So the challenge is that, in the real world, for example in robot soccer, we don't have that many real examples of what's happening, not many real games. So it's hard to do reinforcement learning in the presence of very few examples. That's why we resort to simulations. In simulations, the challenge is that it's not the real world. And therefore, you have to create algorithms that can go through all possible scenarios of the real world, to eventually try to learn policies that will be relevant in the real world. But let me tell you that we also use real data in the simulations to do what we call calibrating our agents: by trying to force, in some sense, the actions of our agents in simulation to be similar to the real world, we end up having the agents converge very close to real policies.
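One simple way to picture calibration is as parameter fitting: pick the simulated agent's parameter so that its actions match observed actions as closely as possible. The Python sketch below assumes a hypothetical agent whose order size scales linearly with a signal, and it uses made-up numbers in place of real observations. The actual calibration procedure is not described in the conversation.

```python
def simulate_actions(theta, signals):
    # Hypothetical simulated agent: order size scales linearly with a signal.
    return [theta * s for s in signals]


def mismatch(simulated, observed):
    # Mean squared difference between simulated and observed actions.
    return sum((a - b) ** 2 for a, b in zip(simulated, observed)) / len(observed)


def calibrate(signals, observed, candidates):
    # Grid search: keep the parameter whose simulated actions look most like the data.
    return min(
        candidates,
        key=lambda t: mismatch(simulate_actions(t, signals), observed),
    )


signals = [1.0, 2.0, 3.0, 4.0]
observed = [2.1, 3.9, 6.2, 7.8]              # made-up stand-in for real observations
candidates = [i / 10 for i in range(0, 41)]  # 0.0, 0.1, ..., 4.0

theta = calibrate(signals, observed, candidates)
print(theta)  # 2.0
```

A grid search is used only because it is easy to read; any optimizer that minimizes the same mismatch would serve the same purpose.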
Excellent. So the challenge here is the small-data problem. I think this is a very significant problem that hits many machine learning researchers, where so many of our algorithms depend on large data sets. And it sounds like you solved it by using simulation to create data to learn from, but used real-world constraints to try to frame or shape that simulation so that it's really learning a realistic perspective. That's really interesting. Well, what's next in your perspective on this research? What's the next big challenge you're going to pursue?
So in the financial world, one way or another, the biggest challenge is to deploy the policies that have been learned, potentially in simulation. Because somehow, if those policies are significantly different from what the human agents use, it's hard to just embrace those policies and bring them to the real world, also because you may not trust them as much, and you may not understand as well why these are the policies. Therefore, the next step in reinforcement learning in multi-agent systems, or in single-agent systems, has a lot to do with being able to explain more why these policies are the ones that were reached in simulation. So explainability in reinforcement learning is quite a challenge.
Very interesting. The explainability of models has been a common topic that many researchers have been talking about, and in fact, one of our bonus questions today surrounds this particular area. I was curious if you could expand on your perspective on transparency in models in general: how do you think we're doing? What do you think the future holds for us to deliver on that explainability and transparency in machine learning models?
With respect to explainability, I have two kind of opposite ways of thinking. One is: let it be, with no need to explain anything; we'll just live with it. Like we drive cars without understanding how they move, or we use the toaster and we don't know how it works; we just press the button, and it works. So we can just think that, well, that's how we will interact with AI systems. They will just be part of our lives. However, there is another part that thinks about the stakeholders of all these AI systems: the developers, the customers, and the regulators. The developers want to understand how they can change things and how they can improve the system. The customers want to have a local explanation: what happened to my particular loan? And the regulators want to know in general whether the system is going to be faithful to the rules of the game. So we have to worry about explanations.
Thanks, Manuela. I think explainability is a great hot topic. I'm sure there'll be many PhD theses on this. And I'd love to ask about approaches that you think may be valuable for explainability. We touched upon its importance. But do you have a couple of thoughts on technical approaches that might be interesting to our audience?
So there are two types of technical approaches that we are actually following in explainability. One is the concept of finding, from the examples used for training the machine learning or reinforcement learning system, what the features are: what are the features of the state, what are the conditions or subsets of states, that are in fact relevant for the decision? So you try to change all the other features, and none of them matter, except the ones that influence that decision, so it's kind of trying to find relevant features. Another approach, in terms of reinforcement learning, is to try to describe the policy, which happens to be a mapping from states to actions, in a more interpretable way; for example, as a decision tree. So you transform the policy (that function) into a decision tree, which makes it easier for humans to understand why the decisions are taken.
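Both approaches can be sketched on a toy problem. The Python example below assumes a hypothetical black-box `policy` over three made-up state features (`trend`, `inventory`, `hour`). It first scores each feature by how often changing it flips the action, then fits a small decision tree that imitates the policy. The tree learner is a bare-bones greedy Gini splitter written for illustration, not a production library and not the method used by the research team.

```python
from collections import Counter
from itertools import product

FEATURES = ["trend", "inventory", "hour"]
VALUES = [[-1, 1], list(range(10)), list(range(24))]
STATES = list(product(*VALUES))


def policy(state):
    # Hypothetical black-box policy; "hour" is deliberately irrelevant.
    trend, inventory, _hour = state
    return "buy" if trend > 0 and inventory < 5 else "hold"


def relevance(policy, states):
    # Approach 1: change one feature at a time and measure how often
    # the chosen action flips.
    scores = {}
    for i, name in enumerate(FEATURES):
        flips = total = 0
        for s in states:
            for v in VALUES[i]:
                if v != s[i]:
                    changed = s[:i] + (v,) + s[i + 1:]
                    flips += policy(changed) != policy(s)
                    total += 1
        scores[name] = flips / total
    return scores


def gini(rows):
    counts = Counter(action for _, action in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())


def fit_tree(rows, depth):
    # Approach 2: greedily fit a small decision tree to (state, action)
    # pairs sampled from the policy. Leaves are action labels; internal
    # nodes are (feature_index, threshold, left_subtree, right_subtree).
    majority = Counter(action for _, action in rows).most_common(1)[0][0]
    if depth == 0 or gini(rows) == 0.0:
        return majority
    best = None
    for i in range(len(FEATURES)):
        for t in sorted({s[i] for s, _ in rows})[1:]:
            left = [r for r in rows if r[0][i] < t]
            right = [r for r in rows if r[0][i] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, i, t, left, right)
    if best is None:
        return majority
    _, i, t, left, right = best
    return (i, t, fit_tree(left, depth - 1), fit_tree(right, depth - 1))


def predict(tree, state):
    # Walk from the root to a leaf, going left when the feature is below the threshold.
    while isinstance(tree, tuple):
        i, t, left, right = tree
        tree = left if state[i] < t else right
    return tree


scores = relevance(policy, STATES)
tree = fit_tree([(s, policy(s)) for s in STATES], depth=2)
fidelity = sum(predict(tree, s) == policy(s) for s in STATES) / len(STATES)
print(scores, tree, fidelity)
```

On this toy policy the irrelevant `hour` feature scores zero, and the depth-2 tree reproduces every decision, so a person can read the policy as two threshold tests. Real policies are rarely this clean, and the tree is then only an approximation whose fidelity has to be checked.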
Thank you. Looking at explainability through a few different approaches, like feature importance, or transforming models into other kinds of model structures that may be more interpretable by humans, like decision trees, is a great perspective on this particular challenge. I'd like to transition perhaps to a higher-level question: What's a common misconception you think people have about artificial intelligence?
So I believe that somehow we have to embrace learning systems that don't do the right thing the first day; it's perfectly fine. They just need to be doing better over time. So this type of understanding, that learning is not a one-shot process of taking the data, classifying, and being done, I think is fundamentally something intrinsic to learning that we have to embrace in our algorithms.
Thanks, Manuela. Well, I, too, have been intrigued over the years with machine learning techniques that are not inherently empirical with lots of data, but perhaps use more symbolic approaches. Many of those symbolic approaches have been de-emphasized because of the Big Data architectures that we can now deploy for more empirical machine learning techniques. But can you imagine a day when we can take a deep learning network and actually kind of boot it up with some knowledge already? With expertise that may come from a human? That's a possible future, isn't it?
Exactly, and that it can take data incrementally, get better over time, and get feedback, with someone saying this is good, this is not good. And the thing improves, and the thing can also proactively ask for more data. It can say, “I'm confused, nothing of what I've seen in training matches where I am now. So I need either more data or advice, or you tell me what to do.” This is a symbiotic way of thinking about humans, machines, and different aspects of learning. So we have to grow our minds and, of course, build upon the tremendous power of reinforcement learning, but go beyond and think that the AI, the true AI system, the intelligent system, will be more intelligent.
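The "I'm confused, nothing I've seen in training matches where I am now" behavior can be approximated very roughly with a distance check against the training examples. In the Python sketch below, the nearest-neighbor rule, the `threshold` value, and the two-dimensional states are assumptions chosen for illustration only.

```python
import math


def act_or_ask(state, examples, threshold):
    # examples: list of (training_state, action) pairs.
    # Use the nearest training example; if even that one is far away,
    # the current state is unfamiliar, so ask instead of acting.
    distance, action = min(
        (math.dist(state, seen), act) for seen, act in examples
    )
    return "ask for help" if distance > threshold else action


examples = [((0.0, 0.0), "hold"), ((1.0, 1.0), "buy")]
print(act_or_ask((0.9, 1.1), examples, threshold=0.5))   # buy
print(act_or_ask((8.0, -3.0), examples, threshold=0.5))  # ask for help
```

Each answer to an "ask for help" request could be appended to `examples`, which gives the incremental, feedback-driven improvement described in this answer.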
That is a fantastic way to close. Manuela, it’s been such a pleasure to meet with you again, I've learned so much. Thank you so much for coming in.
Thank you very much for having me.
If you want to hear Manuela’s thoughts on misconceptions of AI and natural language processing, check out our bonus minutes! They’re linked in the show notes below, and on our website, MLMinutes.com. Next episode, we’ll be discussing ML forest fires. To stay up-to-date on our upcoming guests and giveaways, you can follow our Twitter and Instagram, @MLMinutes. Our intro music is Funkin' It by the Jazzual Suspects, and our outro music is Last Call by Shiny Objects, both on the Om Records Label. ML Minutes is produced and edited by Morgan Sweeney. I’m your host, Monte Zweben, and this was an ML Minute.