An engineering professor at the University of California, Davis, is striving to make autonomous vehicles, or AVs, safer by changing how researchers train them.
According to Professor Junshan Zhang of the UC Davis Department of Electrical and Computer Engineering, AVs struggle when confronted with unforeseen events, such as a car driving on the wrong side of the road.
Zhang and his lab have developed a new open-source platform called CarDreamer to address this issue. The platform lets an AI model learn how to drive on its own by imagining scenarios — say, taking a right turn at a red light or merging onto the highway — in a simulated world.
The driving idea
The concept of a machine learning model teaching itself might sound futuristic, but it's not new.
The idea — reinforcement learning — is one of several established methods for training artificial intelligence. A famous demonstration of reinforcement learning and its benefits is AlphaGo Zero.
Developed by DeepMind (now Google DeepMind) in 2017, AlphaGo Zero was an iteration of AlphaGo, the company's previously released artificial intelligence for playing the abstract strategy board game Go. Whereas DeepMind researchers built AlphaGo on historical data gathered from games played between humans, AlphaGo Zero developed its expertise by playing against itself.
After providing the model with the game's rules, the researchers let its so-called imagination run wild: it tried out different tactics and setups and learned from its failures and successes. Through this technique, the model took only 40 days to surpass every human player and every prior AI built to play the game.
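Reinforcement learning's trial-and-error loop can be sketched in a few lines. The toy problem below, a five-state corridor with a reward at one end solved by tabular Q-learning, is invented for illustration; AlphaGo Zero's actual training, which combined self-play with tree search and deep networks, is far more elaborate.

```python
import random

# Tabular Q-learning on a toy corridor: states 0..4, reward only at state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # step left or step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate
random.seed(0)

for episode in range(200):
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)  # environment transition
        r = 1.0 if s2 == GOAL else 0.0        # reward signal
        # Update the value estimate from the observed outcome.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy walks straight toward the goal.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

No human gameplay appears anywhere in the loop: the model improves only from the rewards its own trials produce, which is the property the article attributes to AlphaGo Zero.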
Current training for AVs resembles not AlphaGo Zero but its forebear, the original AlphaGo. Companies like Tesla and Google's Waymo train their AVs on data gathered from human drivers, says Zhang.
While this is a powerful training method, it limits the vehicle to reacting to the driving situations demonstrated in the given data. This means there will inevitably be gaps in the AI's knowledge, as it is impossible to capture the full range of what drivers can encounter on the road.
One of the reasons why reinforcement learning is such an efficient method is that the model isn't limited to the data researchers have on hand. The model can imagine any driving scenario and play out all possible decisions and consequences.
This kind of learning is also not unlike the cognitive processes of humans, who can handle unexpected and previously unseen occurrences thanks to what researchers call internal world models. These mental models, or imagined environments, draw on a vast collection of past events and actions to help us predict the best reaction to a novel experience in the physical world, like what to do the first time a bad storm knocks out the traffic lights.
It is this kind of robustness in the AV's machine learning model that Zhang wants to provide with CarDreamer.
Dream car in a dream world
A dreaming car is one capable of imagination, and that's what Zhang's CarDreamer platform, described in a paper published in May 2024, enables. It gives agents, or machine learning models, license to imagine lifelike driving scenarios in simulated environments.
The platform uses CARLA, an open-source simulator for autonomous driving research originally developed by the Computer Vision Center in Barcelona with Intel Labs, which lets researchers test automated driving systems. With CARLA, CarDreamer can generate maps based on actual city streets.
"New York City or Germany," Zhang said, providing examples of places the platform can simulate. "CARLA has traffic scenarios based upon real-world data, like how people use traffic intersections and traffic lights."
The machine learning model can imagine traffic dynamics through these maps and evaluate the outcomes of different decisions in a dreamed-up world. On simulated roads, the AI model will learn from the good and bad effects of its decisions, for example, being rewarded with a safe merge when using a blinker.
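A reward signal like the blinker example can be sketched as a simple function. Everything here, including the signal names, the weights, and the merge bonus, is a hypothetical illustration rather than CarDreamer's actual reward definition.

```python
# Hypothetical reward shaping for a highway-merge exercise. The weights are
# invented for illustration; CarDreamer's real reward terms may differ.
def merge_reward(collided: bool, merged: bool, used_blinker: bool,
                 speed: float, speed_limit: float) -> float:
    if collided:
        return -100.0                    # hard penalty: a crash ends the episode
    reward = 0.0
    if merged:
        reward += 10.0                   # success bonus for completing the merge
        if used_blinker:
            reward += 2.0                # extra credit for signaling first
    reward -= max(0.0, speed - speed_limit) * 0.5   # discourage speeding
    return reward

# A safe, signaled merge at 30 in a 25 zone: 10 + 2 - 2.5
print(merge_reward(collided=False, merged=True, used_blinker=True,
                   speed=30.0, speed_limit=25.0))  # 9.5
```

Summed over thousands of simulated episodes, a signal like this is what steers the agent toward behaviors such as signaling before a merge.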
The key behind the platform is state-of-the-art world model algorithms developed by Zhang and his team that allow the models to understand the complex dynamics of driving. Returning to the example of AlphaGo Zero, a world model can be understood as the rules of the game — the parameters that dictate what actions and reactions can occur within a defined system.
"The computer needs to learn the environment, then the state of the environment and then take actions. The action is going to impact the environment, too. The world model is terminology to describe this kind of interaction," Zhang said.
CarDreamer complements these world models with a task development suite that lets researchers create driving exercises for an agent to practice, such as a hard right turn in an urban area. Researchers can also use the suite to develop new tasks tailored to the specific learning needs of their AVs.
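A task in such a suite might pair a map, a route, and a success condition. The class and field names below are hypothetical sketches, not CarDreamer's actual task API:

```python
# Hypothetical driving-task definition; field names are invented for
# illustration and do not reflect CarDreamer's real task interface.
from dataclasses import dataclass

@dataclass
class DrivingTask:
    name: str
    town: str                      # which simulated map to load
    route: list                    # (x, y) waypoints the agent should follow
    success_radius: float = 2.0    # metres from the final waypoint
    max_steps: int = 1000          # time budget for the episode

    def is_success(self, position, step):
        gx, gy = self.route[-1]
        x, y = position
        close = ((x - gx) ** 2 + (y - gy) ** 2) ** 0.5 <= self.success_radius
        return close and step <= self.max_steps

# Example: a hard right turn in an urban area, like the one mentioned above.
hard_right = DrivingTask(
    name="urban_hard_right",
    town="Town03",
    route=[(0.0, 0.0), (20.0, 0.0), (20.0, -15.0)],  # sharp right at a corner
)
print(hard_right.is_success(position=(20.5, -14.8), step=300))  # True
```

Declaring tasks as data like this is what makes it cheap to add new exercises targeting whatever maneuver an AV is currently weakest at.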
Working for a research fleet
The fact that CarDreamer is an open-source platform available to anyone is particularly important to Zhang.
Academic researchers often lack access to the necessary driving data to train their machine learning models, as it's unsafe to run experiments on actual city streets, and AV companies tend to keep their user data private.
With CarDreamer, researchers can bypass both roadblocks: they can run experiments on simulated streets and generate data for improving AV machine learning models that the entire academic community can use.
And with more data accessible and generated than ever before, CarDreamer promises greater opportunity for insights into optimizing AV design, even broadening its use to other vehicle types.
"We want to expand this platform to other applications, such as drones," Zhang said. "It takes more work, but that is something we want to do in the future."
Media Resources
CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving (arXiv.org)
Originally posted on the College of Engineering website.
Matt Marcure is a writer with the UC Davis College of Engineering.