What’s the deal with autonomous vehicles? For starters, saving 40,000 lives per year and answering some of the most interesting questions in artificial intelligence.
Boris Sofman, Head of Perception and Trucking Engineering at Waymo, came in to discuss the methods used to train autonomous vehicles, covering where datasets are gathered, how performance is evaluated, and the potential for these approaches to be leveraged across different industries.
Read on for the transcript!
Hey, it’s Monte. I’m doing a couple of talks this January: one is the second Feature Store Meetup, where I’ll be speaking along with a technical lead from Uber. The other is TWiMLCon, a machine learning conference led by my fellow podcaster Sam Charrington. You can find out more at mlminutes.com/events.
Hi, I’m Monte Zweben, CEO of Splice Machine. You’re listening to ML Minutes, where cutting-edge thought leaders discuss Machine Learning in one minute or less.
This episode, our guest is Boris Sofman, head of Perception and Trucking Engineering at Waymo, an autonomous driving technology company. Prior to Waymo, Boris was the co-founder and CEO of Anki, where he developed and shipped over three and a half million racing and entertainment robots and devices all over the world. Welcome Boris.
Thank you, Monte. Real pleasure to be on.
Well, Boris, you've done really interesting things, from your PhD thesis in robotics to what you're doing now at Waymo. Tell us about your journey. How did you get to where you are now?
I started really getting excited about artificial intelligence and robotics. I did a PhD at Carnegie Mellon, where I was focused on off-road autonomous vehicles. A lot of the early research at the time was happening at Carnegie Mellon, and it was just super fun. I took a detour from the space for about nine or ten years and started Anki, which was a consumer robotics startup focused on a variety of entertainment products. That was a lot of fun; we shipped products all over the world and were able to push on unique capabilities at low cost. And now I'm at Waymo, leading the perception team and the autonomous trucking team.
Excellent. I can't imagine how much fun it is to go from just studying this to putting toys out there for kids to enjoy robotics all the way through to working with autonomous vehicles. Just for our audience to truly understand, why is it so important to have self-driving vehicles? Why is autonomous driving so important?
It's without a doubt one of the biggest technology potentials in the world, both in terms of safety and efficiency. On the safety front, on the consumer side, in the US alone 40,000 people a year die in automotive accidents, and that's just the safety argument. On the efficiency side, we've all seen the stats on how much time people spend in traffic. And in trucking, there's a shortage of drivers: 65,000 today, projected to reach 225,000 within a handful of years, and the average driver age is 50, because nobody wants to go into this field while the demand is actually getting higher and higher. So you have this backbone of both people and goods transportation that is now constrained by all these awkward things like commute time, or caps on driver distance, and so forth. Solving that can completely rethink both how our urban environments are laid out and how our goods get to us.
Interesting. So you're not only solving a convenience problem, this is a true supply chain optimization problem that's constrained by the capacity of drivers.
Tell us about the problems that you're working on now to solve the driverless experience at Waymo.
Yeah, a very different flavor of problems than before. Definitely different constraints and a different scale of challenges, but it's the most exciting robotics and AI problem I've ever seen in my life. I get to focus on the autonomous trucking vertical, which is interesting because it's a slightly more structured environment that you can lean into than the consumer side, but at the same time we're dealing with higher speeds and higher safety risks, so there are different types of complexities. Our goal is to get large-scale autonomous trucks, and vehicles in general in the case of Waymo, into mass-market use, and to do it safely and reliably. In the process, we get to tackle some of the most interesting AI problems I could have ever imagined, and I've definitely felt a lot of appreciation for them.
Excellent. So in your experience, in this more structured but higher speed environment on the trucking side, what are some of the main uses of machine learning for autonomous driving?
So a lot of the core technology is actually shared between the car side and the trucking side, which has been a massive advantage because Waymo has 10 years of experience working on this problem. One of the areas that's most mature in terms of leveraging machine learning is the perception system, a team I've gotten to work with very closely. We have incredibly large-scale data coming in from a large variety of sensors. And at the end of the day, the problem of robustly understanding what is happening in the world around us is a pretty massive-scale machine learning problem, with a wide variety of categories from objects to semantics, even down to basic filtering. So we're leveraging a huge number of models simultaneously. And over time, we're actually starting to use far more machine learning approaches on the planning side as well, which was traditionally a much more search-driven problem.
Excellent. I read somewhere in the materials on the Waymo blog that some of the model building may actually have human input, or at least be evaluated against human input from drivers riding along in the testing vehicles. Can you talk a little bit about how that human expert is injected into the process?
Yeah, sometimes that's actually one of the best signals you have. It's hard, particularly when you think about planning and the behavior of the vehicle itself, to come up with exactly the right answer the way you would with a classifier for vehicles or for animals. In the case of something like planning, oftentimes the right answer is a human expert driver, whether it's in a truck or a car. And the best metrics can end up being comparisons to that human driver: how were you relative to their level of aggressiveness or conservatism? What choice did you make versus what they made? Or, in the case of a prediction system trying to understand what might happen in the future for particular other agents in the environment, the best answer is exactly what they did do in the future. So you can actually use that to build an automated training feedback loop, going back and training your systems in a way that's much more geared towards the real world.
Excellent. So I think what you're saying is that, on the planning side of the use of machine learning, where you're determining in a current state, what is the next best action to take to get into a future state that's closer to your objective or goal, that you may actually compare what the system did to what a human would do in that case, and measure the differences and learn from those differences. Is that a fair assessment?
Yeah, that's a fair assessment. And what's really interesting is that the expert could either be your vehicle and what its driver did, or, in some classes of problems, what other vehicles did. One of the advantages Waymo has is the sheer amount of data we've collected: over 20 million miles of autonomous driving, with all of the data that comes with that. So there's a giant amount of historical data we can use to train these large-scale systems. Sometimes it's to directly train a behavior or a response to a situation, and sometimes it's indirectly propagating those signals back to train things like cost functions or heuristics that help us better influence the overall behavior of the vehicle.
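The human-comparison idea Boris describes above can be pictured with a toy sketch: score a planner's trajectory by how far it deviates from the logged human expert's trajectory. This is purely illustrative (the function name, the mean-squared-deviation choice, and the 2D waypoint format are assumptions, not Waymo's actual metrics), but it shows the shape of an automated feedback signal built from human driving data.

```python
import numpy as np

def imitation_loss(planned, expert):
    """Mean squared deviation between the planner's trajectory and a
    logged human expert trajectory, both given as lists of (x, y)
    waypoints. A toy stand-in for the human-comparison metrics
    described above; lower means closer to what the human did.
    """
    planned = np.asarray(planned, dtype=float)
    expert = np.asarray(expert, dtype=float)
    return float(np.mean((planned - expert) ** 2))

# Two-waypoint example: the planner drifts slightly from the human line.
loss = imitation_loss([[0.0, 0.0], [1.0, 0.5]],
                      [[0.0, 0.0], [1.0, 0.4]])
```

In a real training loop, a signal like this (or a richer comparison of aggressiveness, gap acceptance, and so on) would be backpropagated to tune the planner or its cost functions.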
Excellent. Okay. Let's turn to the stack now. What are some of the tools that you're using in your tech stack throughout each layer?
Yeah, so what's interesting is that Waymo realized pretty quickly that you can't just think of autonomous driving as a software problem; the hardware and sensors have a huge influence on this challenge. So Waymo now designs all of its sensors in-house, and we're actually just bringing up a new generation of lasers, compute, and vehicles. That's a pretty gigantic advantage: with custom-designed sensors, we understand the sensitivities, the filters, and how to customize them relative to the software. We filter the data and propagate it through our systems, we use large-scale labeling, we use our own machine learning infrastructure, and we leverage a lot of the tool sets from TensorFlow and related technologies to optimize training and labeling. We use a lot of onboard systems to execute all this in real time, plus large amounts of internal infrastructure and simulation. I could go on, but I'd run through my minutes, so I'll probably stop.
Fantastic. Okay, now if we had to focus you on one big problem from your time at Waymo that is still really vexing, what is it?
Yeah, I think it's evaluation. At the end of the day, you can solve almost any problem if you have a really good metric you're trying to optimize and improve on. In our case, that's really challenging, particularly when you get to system-wide performance. If you have a really good evaluation system, you can use it simultaneously to detect regressions, to give you a gradient on how to improve towards driverless performance and beyond human-relative safety, and to validate when you're actually ready to launch a system. The challenge is that it's not always obvious how to do that. In a lot of ways, the sheer complexity and dimensionality of this problem makes evaluation probably one of the holy grails of autonomous driving.
Excellent. And the evaluation functions you're optimizing against: do they include higher-level dimensions like trying to optimize how quickly you reach your destination, or minimizing the fuel you use, while obviously always maximizing safety? Are those the kinds of dimensions that might appear in an evaluation function, or do you get much, much more fine-grained?
Yeah, that's exactly the right question. It varies a lot. Sometimes the metrics are very intuitive, like safety metrics: how close you get to a collision, and some evaluation of risk, for example if you're trying to optimize for dangerous situations, cut-ins, and interactions. Sometimes it's actually much more subtle and less intuitive, where you're trying to optimize for some intermediate metric that's a measure of progress along the direction you want to go, or something like comfort, where you can imagine ways to engineer it but it's very easy to come up with the wrong answer. And then one of the biggest challenges is that in the end you only get one choice on what you actually want to do, and the answer is some combination of metrics that might be at odds with each other: you want to make progress, but also be safe, be comfortable, and be decisive. Those are over-constrained at times. That's what makes it really hard.
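One naive way to picture the "combination of metrics at odds with each other" that Boris mentions is a weighted score that collapses several per-dimension metrics into one number. The dimension names, weights, and normalization below are purely illustrative assumptions, not Waymo's actual evaluation function; the point is only how conflicting objectives get forced into a single choice.

```python
def combined_score(metrics, weights):
    """Collapse several per-dimension driving metrics into one scalar.

    `metrics` maps a dimension name (e.g. safety, progress, comfort)
    to a normalized score in [0, 1]; `weights` encodes the trade-off
    between them. Returns the weighted average, also in [0, 1].
    Both the dimensions and the weights are hypothetical.
    """
    total = sum(weights.values())
    return sum(weights[k] * metrics[k] for k in weights) / total

# Illustrative trade-off: safety weighted most heavily, with progress
# and comfort pulling in partially opposing directions.
score = combined_score(
    {"safety": 0.95, "progress": 0.60, "comfort": 0.80},
    {"safety": 5.0, "progress": 2.0, "comfort": 1.0},
)
```

The hard part in practice, as the conversation makes clear, is not the arithmetic but choosing the dimensions and weights so that the single scalar actually reflects good driving.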
Okay, so I have one more question that's kind of in a bonus area.
In terms of AI, you've been doing AI for a number of years, where do you see AI in 10 years?
Ah, very good question. I do think there's a huge overlap between different applications, so the tools and breakthroughs that get developed, for example, in autonomous driving, or voice recognition, or computer vision actually carry over quite well, because there are a lot of similarities in flavor. So what I would expect is that some of these really amazing breakthroughs in new types of understanding around large-scale data could be leveraged in completely different industries, and I think we might be really surprised at how big an impact they have. That becomes a direct consequence of some of the massive investments being made in areas like search or autonomous driving. That part is really, really exciting to see. And we've seen this all the time: the smartphone industry, for example, has had massive secondary benefits, and likewise in other places.
So being able to take some of the techniques you're using in autonomous driving and apply them to medical use cases or something like that.
Exactly, yeah: medical, industrial. When you think about the sensors and the understanding of the world, the hardest problems in the AV space feel like a superset of some of the largest AI problems today. How do you understand human intent? How do you understand massively interactive systems? Those feel like the foundation for so many other AI challenges in front of us.
Boris, this has been quite a bit of fun. It's been great catching up. It's been a pleasure. Thanks so much.
Awesome Monte, always a pleasure, thank you for the conversation. This was great.
If you want to hear Boris nerd out on more technical details, check out our bonus minutes. They're linked in the show notes below and on our website, mlminutes.com. Next episode, we’ll be exploring how machine learning is used to empower workers with a hundred million member community on Wednesday, January 20th. To stay up to date on our upcoming guests and giveaways, you can follow our Twitter and Instagram @MLMinutes. ML Minutes is produced and edited by Morgan Sweeney. I’m your host, Monte Zweben, and this was an ML Minute.