The Uncanny Valley of Chatbots

In 1970, roboticist Masahiro Mori coined the phrase “uncanny valley” (不気味の谷 in the original Japanese) to denote the phenomenon that when a machine seems close-but-not-quite human, it triggers a negative response. We especially see this effect in animated films, such as The Polar Express.

Mismanaged Expectations

While everyone working on human-computer interaction has to watch out for the uncanny valley, chatbots are especially vulnerable to it. Chatbots, by design, strive to be human-like in their interactions. But it’s far too easy to mismanage human expectations and deliver a negative experience.

Indeed, one of the biggest problems facing chatbot developers is expectation management. When you put a conversational interface — especially a voice interface — in front of people, you create the expectation that the machine can hold up its side of the conversation.

Unfortunately, today’s machines are highly limited. Even Loebner Prize winners such as Rose and Mitsuku, which represent the state of the art in chatbots, start off as promising conversationalists but quickly reveal their lack of human intelligence. And don’t even bother trying to have a conversation with Siri, Google Now, or Microsoft’s Tay.

Lack of Affordances

It’s not just that chatbots aren’t lifelike enough. It’s that they don’t provide the right affordances. In other words, we aren’t able to easily figure out what they can and can’t do.

For example, Siri can set a 10-minute timer or compute currency exchanges, but it doesn’t know about train schedules. How do I know this? Trial and error. And it’s hard to generalize from that trial and error, beyond figuring out that anything complicated probably won’t work. Like most people, I’ve come to trust Siri only to handle a small set of simple use cases.

Expectation Management

We don’t have this problem with our other human-computer interactions — at least not to the same degree. We generally expect our interactions with machines to be awkward, mechanical, and literal; and machines deliver on those expectations. We might not experience magic and delight, but we also avoid frustration and disillusionment.

Given that today’s chatbots can’t pass for human, they need to manage down our expectations, making the best of their capabilities without trying to hide their weaknesses. As Phil Libin said in “A Charge of Bots”:

The best bots, I think, will be hybrids with multiple interaction modalities. You talk to a bot in whichever way is most natural, and it responds in the most efficient way possible. Think less Turing test and more R2D2. Bots don’t have to pretend to be human; they just have to be fast and effortless.

Creating Affordances

But managing down expectations isn’t enough. For chatbots to be effective, they need to create the right affordances so that we know how to use them.

For example, x.ai offers a chatbot that strives to do one job: helping people schedule meetings. Scheduling a meeting is a relatively bounded problem of determining who, when, and where. It’s still a surprisingly difficult problem — witness this amusing exchange between x.ai and rival chatbot Clara Labs. Nonetheless, both of these chatbots deliver reasonable experiences — and part of their success comes from managing users’ expectations and creating the right affordances.

I’m not suggesting that every chatbot should restrict itself to a narrow domain, though I think it’s not a bad idea. Regardless, to borrow a phrase from Clint Eastwood, a chatbot’s gotta know its limitations. The key challenge of creating a positive user experience is making it easy for the user to figure out what the chatbot can and can’t do.
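
To make the idea of a chatbot that knows its limitations a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the Skill and BoundedBot names, the keyword matching, and the stubbed handlers are assumptions made for illustration, not how Siri, x.ai, or any real product works. The point is only the shape of the design: the bot keeps an explicit list of what it can do, and when a request falls outside that list it says so and shows examples instead of guessing.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Skill:
    """One thing the bot can do, plus an example phrase to show the user."""
    name: str
    example: str
    keywords: Tuple[str, ...]
    handler: Callable[[str], str]


class BoundedBot:
    """A bot with a fixed set of skills that advertises its own limits."""

    def __init__(self, skills: List[Skill]) -> None:
        self.skills = list(skills)

    def capabilities(self) -> str:
        # The affordance: a plain list of supported tasks with sample phrasing.
        lines = [f'- {s.name} (try: "{s.example}")' for s in self.skills]
        return "Here is what I can do:\n" + "\n".join(lines)

    def reply(self, message: str) -> str:
        text = message.lower()
        # Naive keyword matching stands in for real language understanding.
        for skill in self.skills:
            if any(keyword in text for keyword in skill.keywords):
                return skill.handler(message)
        # Out of scope: admit it and show the user what does work.
        return "Sorry, I can't help with that. " + self.capabilities()


# Hypothetical skills with stubbed handlers, for illustration only.
bot = BoundedBot([
    Skill("Set a timer", "set a 10-minute timer", ("timer",),
          lambda message: "Timer started."),
    Skill("Convert currency", "convert 20 dollars to euros",
          ("convert", "currency"),
          lambda message: "Currency conversion requested."),
])

print(bot.reply("Please set a timer"))
print(bot.reply("When is the next train?"))
```

In this sketch, asking about train schedules does not produce a confident wrong answer; the user gets an apology followed by the list of things the bot can handle, which replaces some of the trial and error with an explicit menu.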

Avoiding the Uncanny Valley

It’s probably not fair to describe today’s chatbots as being in the uncanny valley, since none of the real ones (movies don’t count) are that lifelike. Still, chatbots engage with people in natural language, using the same messaging interfaces people use to communicate with one another. We are slipping into dangerous territory.

Perhaps we’ll someday discover — or rather, invent — the holy grail of artificial intelligence and create chatbots that can truly pass for human. I’m not holding my breath. We’ve got a long, hard slog ahead of us.

So let’s follow Phil’s advice to “think less Turing test and more R2D2.” Chatbots should promote their strengths while exposing their weaknesses. Chatbot developers need to manage expectations and create the right affordances. The Turing test can wait.
