Alexa, Let’s Chat

UC Davis students built a chat bot using data from millions of Amazon Alexa user conversations. Their goal? Ask it anything.

Photo: A man sits in a chair in front of a fireplace, back to the camera, facing a second chair that holds a lit-up Amazon Echo device.
(Karin Higgins/UC Davis)

Alexa: Who’s your favorite actor or actress? Mine is Leonardo DiCaprio, he consistently gives great performances and uses his fame to help the environment, too.

Me: Actually I wanted to talk to you about artificial intelligence.

I’m interviewing a machine.

Through my teen’s Amazon Echo device, I am talking with intelligent software created by UC Davis students. Last fall, UC Davis’ Team Gunrock won first prize and $500,000 in the Amazon Alexa Challenge to create a social chat bot that can converse engagingly with humans on a variety of subjects.

Just now though, getting off the topic of movies is a little difficult.

Alexa: Hmm. OK, sure. Do you have a favorite director? I’ll watch anything by Christopher Nolan. He has so many great movies like “Inception” and “Interstellar.”

A social bot, or chat bot, is a computer program that can communicate by itself with humans. Bots can communicate through text, speech or social media.

“We’re basically trying to build technologies that can understand natural language,” said Zhou Yu, assistant professor in the Department of Computer Science, who advised the team. Before coming to UC Davis, Yu took part in a previous round of the Amazon Alexa Challenge as a graduate student at Carnegie Mellon University.

The Alexa prize came after about a year of work by the UC Davis team, which included Chun-Yen (Arbit) Chen, Dian Yu, Weiming Wen, Terry Yang, Mingyang Zhou, Kevin Jesse, Austin Chau, Antara Bhowmick, Shreenath Iyer, Giritheja Sreenivasulu and Ashwin Bhandare.

Eight semifinalists were announced in February 2018. Amazon device users around the world chatted with and rated the bots. Two of three finalists were selected on the basis of user scores. The UC Davis team got a wild card into the final round, judged by a panel of experts. The prizes were announced Nov. 27 during the Amazon Web Services re:Invent conference in Las Vegas.

“The other two finalists had placed second and third last year, but we built everything from scratch,” Yu said. Team Gunrock started behind the more experienced teams, but its scores built steadily over time.

In the finals, Team Gunrock’s bot sustained conversations for almost 10 minutes. That’s short of the 20 minutes that would have triggered a million-dollar grand prize, but still a formidable achievement.

“Even for a person, it’s hard to hold an engaging conversation with a stranger for as long as 20 minutes,” said Terry Yang, a member of the team and an undergraduate majoring in computer science. “You can imagine how difficult it is to build a computer that is capable of such a task.”

In a conversation, you have to understand what is said to you, think of a response and then express it. In computer terms, that means natural language processing, looking up information online and natural language generation. Team Gunrock used a combination of machine learning and prewritten templates so that their bot could understand human speech and respond to it.

Yet a successful back and forth calls for more than listening and responding.

“Good conversation involves more than just these classical AI challenges in natural language processing and dialog systems and requires more of a social/emotional intelligence,” Yang said.

In computer terms, you also need a good user experience.

Me: Alexa, what is machine learning?

Alexa: I think it’s the study of how brains compute. But I don’t know a lot about it, to be honest.

Me: What is natural language processing?

Alexa: Natural language processing is a subfield of computer science, information engineering and artificial intelligence concerned with the interactions between computers and human languages, in particular how to program computers to process and analyze large amounts of natural language data. Anyways, I was thinking. I am a huge fan of science fiction. What’s your favorite genre?

The Team Gunrock bot is made up of a series of modules, Yang said. A master dialog manager handles the modules and decides which to use. Based on what you say to it, the bot can decide whether to respond with information, a question or an opinion. Or it can switch topics to talk about something else.
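
To make the idea of a master dialog manager concrete, here is a minimal sketch in Python. It is not Team Gunrock's code: the module names, the keyword lists and the keyword-overlap routing rule are all assumptions made for illustration, standing in for whatever language understanding the real bot uses.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class TopicModule:
    """One topic-specific responder (hypothetical structure)."""
    name: str
    keywords: Set[str]
    respond: Callable[[str], str]


def words_in(utterance: str) -> Set[str]:
    """Lowercase the utterance and split it into a set of words."""
    return set(re.findall(r"[a-z']+", utterance.lower()))


class DialogManager:
    """Picks one module per user turn; assumed keyword-overlap routing."""

    def __init__(self, modules: List[TopicModule], fallback: TopicModule):
        self.modules = modules
        self.fallback = fallback

    def select(self, utterance: str) -> TopicModule:
        words = words_in(utterance)
        # Choose the module whose keywords overlap most with the utterance.
        best = max(self.modules, key=lambda m: len(words & m.keywords))
        # With no overlap at all, switch topics via the fallback module.
        return best if words & best.keywords else self.fallback

    def reply(self, utterance: str) -> str:
        return self.select(utterance).respond(utterance)


manager = DialogManager(
    modules=[
        TopicModule("movies", {"movie", "movies", "actor", "director"},
                    lambda u: "Do you have a favorite director?"),
        TopicModule("technology", {"computer", "computers", "technology"},
                    lambda u: "Do you try to keep up with technology?"),
    ],
    fallback=TopicModule(
        "fallback", set(),
        lambda u: "I am a huge fan of science fiction. What's your favorite genre?"),
)

print(manager.reply("Who is your favorite actor?"))
```

In this sketch, a remark about string theory matches no keywords, so the manager falls back to changing the subject, much as the bot does in the exchanges quoted here.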

Because it is drawing on publicly available information on the internet, some topics — such as movies, sports and popular culture — are better represented than more specialized topics.

“If you look into, say, string theory it’s hard to find things to talk about, but if you look into books or movies there’s IMDB with all this trivia, there’s Google Books with a lot of structured information,” Yu said.

Photo: A girl has a conversation with Alexa
(Karin Higgins/UC Davis)

Yu studies how people engage with computers.

“Engagement is very hard to tease out. There’s task engagement and social engagement,” Yu said.

Task-oriented systems have become quite common. You’ve probably used one to book a flight or make a restaurant reservation without involving a human being. Task-oriented conversation is easier to program than social conversation. The former has a desired goal (book a flight or a table) and a limited set of vocabulary and options.
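
A rough sketch shows why the task-oriented case is simpler. Assuming a hypothetical flight-booking bot with three required details (the slot names and prompts below are invented for illustration), the program only has to ask for whatever is still missing until the goal is met:

```python
from typing import Dict, Optional

# Hypothetical details a flight-booking bot must collect, in order.
REQUIRED_SLOTS = ("origin", "destination", "date")

PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where would you like to go?",
    "date": "What day do you want to travel?",
}


def next_prompt(slots: Dict[str, str]) -> Optional[str]:
    """Return the question for the first missing slot, or None when done."""
    for slot in REQUIRED_SLOTS:
        if slot not in slots:
            return PROMPTS[slot]
    return None  # Every slot is filled, so the booking can proceed.


print(next_prompt({"origin": "Sacramento"}))
```

The conversation has a fixed end point and a short list of questions. A social conversation has neither.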

Social conversation, which can be open-ended and switch topics, is much more challenging.

“In a social conversation you can ask about anything and it can go anywhere,” Yu said. “Language understanding is much harder; controlling the dialog is much harder.”

And language can be ambiguous. “Turkey” for example, could be a country, a bird or a dish.
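
One simple way a program might guess which "turkey" is meant is to look at the surrounding words. The sketch below is an illustrative assumption, not a description of how the Gunrock bot resolves ambiguity, and the cue words for each sense are made up for the example:

```python
import re
from typing import Optional

# Hypothetical context words that hint at each sense of "turkey".
SENSE_CUES = {
    "country": {"visit", "trip", "travel", "istanbul", "flight"},
    "bird": {"wild", "feathers", "flock", "farm", "gobble"},
    "dish": {"roast", "thanksgiving", "dinner", "gravy", "sandwich"},
}


def guess_sense(sentence: str) -> Optional[str]:
    """Pick the sense whose cue words appear most often in the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no cue words at all, the sentence stays ambiguous.
    return best if scores[best] > 0 else None


print(guess_sense("We roast a turkey for Thanksgiving dinner"))
```

On its own, the single word "Turkey" gives this sketch nothing to go on, which is the problem a social bot faces whenever a user offers a short reply.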

But even when we are engaged in a task, most conversation between humans, as much as 80 percent, is social.

“It’s a buffer for you to connect stuff together. Social conversation can build trust, build rapport before you complete a task together. This is how human beings behave,” Yu said.

When she started working on social bots about five years ago, her work was viewed with some skepticism, but that has been changing, Yu said.

“I used to have to justify why I was working on social bots, but over the years more and more people joined me,” she said.

Alexa: Do you try to keep up with technology?

Me: I’m trying.

Alexa: Great! I wouldn’t exist without science and tech. So it’s personal. Ha, ha.

Right now, social bots such as Alexa are largely for entertainment. But they have many potential uses, Yu said. They can help people who cannot use keyboards or do not know how, such as young children, the elderly and people with disabilities, to communicate with machines. They can also be used in “hands-free” situations such as driving or cooking.

“There are a lot of tasks that require hands or eyes to be somewhere else. It can really open up access to technology,” Yang said.

Alexa: Do you want to know some cool things about computers that show how far we’ve come?

Me: Yes, go ahead.

Alexa: Here’s another thing I saw. The average computer user blinks seven times a minute, less than half the average rate of 20. I think it’s kind of nice that you don’t need to stress your eyes when you’re talking to me.

The Team Gunrock social bot will be available on Amazon devices until the new competition begins in December. Just say, “Alexa, let’s chat.”
