One of the most tantalizing tech rumors in recent weeks has centered on The Information’s report that Apple is not only releasing a Siri API for third-party developers, but that the endgame is some kind of Siri-powered competitor to Amazon’s Echo standalone speaker. The potential for a more open version of Siri is massive, as Apple’s more open competitors in the voice recognition space can already tell you.
One of the players in the voice recognition space, and one that is not tied to any particular hardware manufacturer, is Hound, perhaps best known as the sister product of the SoundHound music recognition app. While Shazam is the name brand in that particular field, anyone who has used both can tell you that SoundHound is the superior, more advanced app. Not only can it recognize songs that you hum if you’re trying to remember something you heard, but when you use its more traditional Shazam-style functionality, SoundHound scrolls the lyrics live in sync with the song playing in the room. It’s an impressive piece of software, and it shows just how sharp its developers are.
“The real value is owning all the IP and innovating beyond what others can do,” SoundHound CEO and founder Keyvan Mohajer explained to Motherboard. “We own 100% of the IP in this area. We don’t license from anyone; we don’t license from Nuance, Google, [or] Microsoft. Everything that you see on Hound and Houndify is 100% owned by us.” The showpiece of their tech right now is what they call “speech to meaning,” which handles in one pass what is usually two separate steps: speech to text, then text to meaning. That extra step in the conventional approach slows down processing, and if nothing else, Hound is noticeably faster than its competitors. But where Hound really shines is detailed queries, like finding a restaurant that has free Wi-Fi while excluding certain delicacies. While it’s somewhat rough around the edges in spots, Hound has the makings of a serious competitor in the voice recognition space.
The Siri API would solve a problem that Mohajer zeroed in on: Siri is too hard to develop for, which limits the ways the assistant can be used. “Apple launched Siri five years ago with 12 domains,” he said, using the industry parlance for types of voice queries. “So Siri could do weather, stock market, calendar, address book, navigation, music, [and so on], just 12 things. If you think about it, Siri can be more useful if it can do more things. Apple should be adding more domains. But after five years, Siri has gone from 12 domains to 25 domains. A company as powerful as Apple took five years to go from 12 to 25.” Hound, on the other hand? “We went from 50 to 125 in six months.”
After some false starts in the past, voice recognition software has finally taken off with the masses, and Mohajer is, naturally, excited to see what’s next. “The space is becoming extremely hot. Some areas become hot in the area because people talk about it, but they don’t have a lot of traction, so they die away, but this one? We are seeing adoption by users. Voice, natural language, and conversational interfaces are the next big thing. We feel very fortunate that when we worked so hard for 10 years to become ready, the world became ready at the same time.”