Voice artificial intelligence vendor SoundHound AI has launched Dynamic Interaction, a real-time multimodal interface that responds to voice and touch.
Unlike existing voice technology, which requires wake words and takes time to process requests, Dynamic Interaction combines fragment parsing, which breaks speech down into partial utterances, with full-duplex audio-visual integration to respond instantaneously.
“As the Dynamic Interaction demo shows, this technology is incredibly user-friendly and precise. Consumers won’t have to modify how they speak to the voice assistant to get a useful response – they can just speak as naturally as they would to a human. As an added bonus they’ll also have the means to instantly know and edit registered requests. In our 17-year history of developing cutting-edge voice AI, this is perhaps the most important technical leap forward. We believe, just like how Apple's multi-touch technology leapfrogged touch interfaces in 2007, this is a significant disruption in human-computer interfaces,” says Keyvan Mohajer, Co-Founder and CEO of SoundHound.
A breakthrough in conversational AI
The company says this "category-level breakthrough" in conversational AI revolutionizes the customer experience by allowing restaurants and other service businesses to use voice AI to improve the quality of their interactions with customers.
For example, when ordering food at a restaurant, customers can talk at their natural pace and the system responds instantly, much as in a human-to-human interaction. In a demo video, SoundHound shows how users can place or edit a food order seamlessly.
In addition, Dynamic Interaction can:
- Completely ignore off-topic speech while only responding to domain-specific topics, like the items on a menu.
- Provide continuous "live" audio and visual feedback as the customer engages with a device or service, reassuring the customer that the order has been understood.
- Allow users to edit and delete requests in real time.
- Provide suggestions to the user based on a real-time interpretation of the user's speech.
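The behaviours listed above can be illustrated with a toy sketch. This is not SoundHound's implementation, which the article does not describe. The menu, the removal keywords, and the names `Order` and `process_fragment` are all hypothetical, and the speech is assumed to arrive already transcribed as short text fragments. The sketch only shows the general idea of updating an order fragment by fragment while ignoring off-topic speech.

```python
import re
from dataclasses import dataclass, field

# Hypothetical domain vocabulary; a real system would be far richer.
MENU = {"burger", "fries", "cola"}
REMOVE_WORDS = {"remove", "delete", "cancel"}


@dataclass
class Order:
    """Running order: item name -> quantity."""
    items: dict = field(default_factory=dict)

    def add(self, item: str) -> None:
        self.items[item] = self.items.get(item, 0) + 1

    def remove(self, item: str) -> None:
        if item in self.items:
            self.items[item] -= 1
            if self.items[item] == 0:
                del self.items[item]


def process_fragment(order: Order, fragment: str) -> bool:
    """Apply one partial utterance to the order.

    Returns True if the order changed, so a caller knows to refresh
    the on-screen summary. Fragments with no menu items are ignored.
    """
    words = re.findall(r"[a-z]+", fragment.lower())
    removing = any(word in REMOVE_WORDS for word in words)
    before = dict(order.items)
    for word in words:
        if word in MENU:
            if removing:
                order.remove(word)
            else:
                order.add(word)
    return order.items != before


if __name__ == "__main__":
    order = Order()
    fragments = [
        "I'd like a burger",
        "and um fries",
        "what's the weather like",    # off-topic: ignored
        "actually remove the fries",  # edit in real time
        "and a cola",
    ]
    for fragment in fragments:
        if process_fragment(order, fragment):
            print(fragment, "->", order.items)
```

Because each fragment is handled as soon as it arrives, the display can be refreshed mid-sentence rather than after the customer has finished speaking, which is the effect the article attributes to fragment parsing.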
Users can also input information with voice and touch interchangeably. The assistant then responds with audio and visual output, deciding when to speak to the user and when to simply update the display.
Businesses can use Dynamic Interaction anywhere they might interact with a customer. For restaurants, this could be a drive-thru, a kiosk, a smartphone, a laptop, or even over the phone, where Dynamic Interaction can enable instant verbal and visual interactions.