Google reveals natural sounding voice AI, called Duplex

Google has unveiled natural voice capabilities for its AI-powered Google Assistant.

The internet giant took to the stage at I/O 2018 and published a blog post, where it demonstrated how its Google Assistant could book a hair appointment over the phone, with the person on the other end unaware they were speaking with a computer.

The natural language technology, named Google Duplex, mimics the human tendency to use more complex sentences, self-correct mid-sentence and rely on context. The astonishing examples, which also include booking a table at a restaurant, even see the AI throw in interjections such as ‘hmmm…’, which give an even greater impression of natural speech.

“Even with today’s state of the art systems, it is often frustrating having to talk to stilted computerised voices that don’t understand natural language,” said Yaniv Leviathan, principal engineer, and Yossi Matias, VP of engineering at Google, in the blog post. “In particular, automated phone systems are still struggling to recognize simple words and commands. They don’t engage in a conversation flow and force the caller to adjust to the system instead of the system adjusting to the caller.”

Humans do not always speak clearly, and phone calls can also suffer from loud background noise and poor sound quality. Google’s Duplex features a recurrent neural network (RNN) designed to cope with these challenges, built using TensorFlow Extended (TFX).

“To obtain its high precision, we trained Duplex’s RNN on a corpus of anonymized phone conversation data,” said Google. “The network uses the output of Google’s automatic speech recognition (ASR) technology, as well as features from the audio, the history of the conversation, the parameters of the conversation (e.g. the desired service for an appointment, or the current time of day) and more. We trained our understanding model separately for each task, but leveraged the shared corpus across tasks. Finally, we used hyperparameter optimization from TFX to further improve the model.”
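To make that description more concrete, the following is a minimal, illustrative sketch of how a recurrent network can fold several feature groups into a single running state, one conversational turn at a time. It is not Google's implementation: the names `build_input` and `TinyRNN`, the feature sizes and the random, untrained weights are all assumptions chosen for illustration, and it uses only the Python standard library rather than TensorFlow.

```python
import math
import random


def build_input(asr_tokens, audio_features, history_vec, params_vec, vocab):
    """Join the feature groups for one turn into a single flat vector.

    All four groups are hypothetical stand-ins for the inputs the article
    lists: ASR output, audio features, conversation history and parameters.
    """
    # Bag-of-words encoding of the ASR transcript for this turn.
    bow = [0.0] * len(vocab)
    for tok in asr_tokens:
        if tok in vocab:
            bow[vocab[tok]] += 1.0
    return bow + list(audio_features) + list(history_vec) + list(params_vec)


class TinyRNN:
    """A plain (Elman-style) recurrent cell with random, untrained weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.w_in = [[rng.uniform(-0.1, 0.1) for _ in range(input_size)]
                     for _ in range(hidden_size)]
        self.w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def step(self, x, h):
        """Combine the current input x with the previous hidden state h."""
        new_h = []
        for i in range(self.hidden_size):
            total = sum(w * xi for w, xi in zip(self.w_in[i], x))
            total += sum(w * hj for w, hj in zip(self.w_rec[i], h))
            new_h.append(math.tanh(total))
        return new_h

    def run(self, inputs):
        """Feed the turns in order and return the final hidden state."""
        h = [0.0] * self.hidden_size
        for x in inputs:
            h = self.step(x, h)
        return h


# Two made-up turns of a restaurant-booking call.
vocab = {"book": 0, "table": 1, "tomorrow": 2}
turns = [
    build_input(["book", "table"], [0.2, 0.7], [0.0], [1.0, 0.5], vocab),
    build_input(["tomorrow"], [0.1, 0.4], [1.0], [1.0, 0.5], vocab),
]
rnn = TinyRNN(input_size=len(turns[0]), hidden_size=4)
state = rnn.run(turns)
```

In a real system the final state would feed a trained output layer, and the weights would be learned from conversation data rather than drawn at random.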

Google intends to test the Duplex technology further within its Assistant platform this summer.

Earlier this year at NRF 2018, Google’s director of global business development, retail and shopping, Michael Haswell, said the tech giant is only in the “very, very, very early stages with voice” despite a 20-year investment in the technology.