SoundHound, the audio-recognition company that launched an eponymous music search engine in 2009, first introduced Hound in private beta last summer. After some fine-tuning, the app is now available on iOS and Android. The digital assistant aims to make human-device interaction as natural as possible: when you ask Hound a question, you don't have to modify your speech for the software to understand. You talk to it in your normal tone, pace and accent.
When I summoned Siri to ask "Where can I find Blue Bottle coffee?" it first started a FaceTime call with a contact whose name sounded nothing like my question and then decided to play an EP by a band I didn't recognize. It was a great reminder of why I don't talk to my iPhone. But when I asked Hound the same question on my handset, it pulled up eight coffee shops with my preferred brew.
To replicate a human-to-human interaction, AI assistants need to get better at understanding your words. According to founder and CEO Keyvan Mohajer, what sets Hound apart from its digital contemporaries is its speech-to-meaning technology: it performs speech recognition and language understanding in a single pass, so it doesn't depend on any specific keywords.
In addition to understanding your regular speech, it works with multiple variables to pick up on the subtle details of complex, layered questions. If you ask for hotels in the area, filtered by location, price, reviews and personal preferences, the app delivers precise suggestions from your voice command alone. When you're in the app, you can start the conversation with "OK, Hound" and lead into a quick search or a complicated query, no swipes or taps needed.
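To get a feel for what a layered query implies, here's a minimal sketch of multi-constraint filtering: each criterion the speaker mentions becomes a filter, and unmentioned ones are simply skipped. The hotel data, field names and `search` function are invented for illustration; Hound's actual pipeline is proprietary.

```python
# Toy illustration of multi-constraint search over hypothetical hotel data.
hotels = [
    {"name": "Harbor Inn", "miles": 0.8, "price": 140, "rating": 4.5},
    {"name": "Budget Stay", "miles": 2.5, "price": 75, "rating": 3.1},
    {"name": "Grand Plaza", "miles": 0.4, "price": 310, "rating": 4.8},
]

def search(hotels, max_miles=None, max_price=None, min_rating=None):
    """Apply every constraint the spoken query mentioned; ignore the rest."""
    results = []
    for h in hotels:
        if max_miles is not None and h["miles"] > max_miles:
            continue
        if max_price is not None and h["price"] > max_price:
            continue
        if min_rating is not None and h["rating"] < min_rating:
            continue
        results.append(h["name"])
    return results

# "Hotels within a mile, under $200 a night, rated four stars or better"
print(search(hotels, max_miles=1.0, max_price=200, min_rating=4.0))
# → ['Harbor Inn']
```

The hard part, of course, is extracting those constraints from free-form speech; the filtering itself is the easy half.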
You can also continue to pose follow-up questions until the search is refined to your needs. Each time I did, Hound remembered the context of my earlier questions. And even though its automated voice kept throwing me off, the back-and-forth between us felt intuitive and oddly familiar, like a real conversation.
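That context carry-over can be pictured as a session that merges each turn's constraints into a running state instead of starting over. The `Session` class and its fields below are hypothetical, purely to illustrate the behavior, not Hound's internals.

```python
# Minimal sketch of conversational context: each follow-up adds constraints
# on top of everything said so far.
class Session:
    def __init__(self):
        self.constraints = {}

    def ask(self, **new_constraints):
        """Merge this turn's constraints into the running query state."""
        self.constraints.update(new_constraints)
        return dict(self.constraints)

s = Session()
s.ask(kind="hotel", city="Seattle")   # "Find hotels in Seattle"
s.ask(max_price=200)                  # "Under $200 a night"
final = s.ask(pets=True)              # "That allow pets"
print(final)
# → {'kind': 'hotel', 'city': 'Seattle', 'max_price': 200, 'pets': True}
```

A real assistant also has to decide when a follow-up *replaces* an earlier constraint rather than adding to it, which is where context tracking gets genuinely hard.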
SoundHound also wants its speech-to-meaning software to be easy to build into other machines. Its voice-enabling platform, Houndify, already used by NVIDIA and Samsung in their respective Internet of Things devices, opens up the technology to anyone who wants to create a new discoverable topic. Since its beta launch in June, the app has gained more than 100 domains, or data categories, including weather, music, nutrition and sports scores, to keep the conversations wide-ranging.
Domains are integral to a virtual assistant. In addition to building and adding new ones, Hound comes loaded with Uber and Yelp integrations to eliminate the need to swipe and type on the phone. So when you tell Hound your location and ask for an Uber, it rounds up the closest car options in an instant and lets you book the ride directly, without ever tapping the Uber app. This hands-free experience is convenient for frigid New York nights when the gloves must stay on, but it's also useful for anyone who doesn't have time to stop and swipe.
To make things even easier, when asked how much a ride would cost from the current location to a destination, Hound pulls up a quick fare estimate for various Uber options (Pool, X, XL, etc.) and even points out surge pricing. Fare estimating often feels tedious in the ride-sharing app, so this voice-activated shortcut is particularly handy when you're debating splurging on your commute.
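The math behind a quoted estimate is back-of-the-envelope stuff: a base fare plus per-mile and per-minute charges, scaled by any surge multiplier. The rates below are invented placeholders, not Uber's actual pricing.

```python
# Toy fare estimate: (base + distance + time charges) scaled by surge.
def estimate(base, per_mile, per_min, miles, minutes, surge=1.0):
    return round((base + per_mile * miles + per_min * minutes) * surge, 2)

# A 5-mile, 20-minute ride at 1.5x surge, with made-up rates:
print(estimate(base=2.55, per_mile=1.75, per_min=0.35,
               miles=5, minutes=20, surge=1.5))
# → 27.45
```

Surfacing the surge multiplier in the spoken answer is the useful part; the arithmetic itself is trivial once the assistant has pulled the current rates.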
Yelp search seems much more responsive, too. You can pull up local restaurants based on ratings, cuisines or even exceptions. When I asked, "What are the closest Asian restaurants, except Chinese?" Hound came back with a range of options, Japanese and Korean among them, while skipping Chinese. The same question perplexed Siri: she came back with only Chinese options.
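Logically, "Asian restaurants, except Chinese" is an include category plus an exclude set, sorted by distance. Here's a toy sketch of that query shape; the restaurant list and cuisine sets are invented for the demo.

```python
# Sketch of category-with-exception search: include one cuisine group,
# subtract an excluded subset, return names nearest-first.
restaurants = [
    ("Sakura", "japanese", 0.3),
    ("Golden Dragon", "chinese", 0.2),
    ("Seoul Garden", "korean", 0.5),
    ("Pho 88", "vietnamese", 0.7),
]

ASIAN = {"japanese", "chinese", "korean", "vietnamese", "thai"}

def closest(restaurants, include, exclude=frozenset()):
    hits = [(dist, name) for name, cuisine, dist in restaurants
            if cuisine in include and cuisine not in exclude]
    return [name for dist, name in sorted(hits)]

print(closest(restaurants, include=ASIAN, exclude={"chinese"}))
# → ['Sakura', 'Seoul Garden', 'Pho 88']
```

The Siri failure described above suggests it matched the keyword "Chinese" instead of parsing "except" as a negation, which is exactly the kind of case a structured query representation handles cleanly.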
The pace and precision of virtual assistants like Hound are indicative of the connected future, in which machines understand the needs of their humans. While Hound doesn't have Siri or Cortana's snark, it has more than enough intelligence to be good at its job.