Virtual assistants on mobile phones are becoming smarter
Siri, Google Now and Microsoft’s Cortana are evolving, but other players might beat them to the punch.
Virtual assistant services like Apple’s Siri, Google Now and Microsoft’s Cortana are evolving out of their voice-centric shells into something more useful. But there are other players with grander visions that might beat them to the punch.
While the tech giants tout their potential, current voice assistant services see little day-to-day use and rarely work as expected. How many times have you queried Google Now, Siri or Cortana only to be greeted with a Google or Bing search in response? And even if voice assistants were more helpful, how many people really want to bark orders at their phone on a frequent basis? That’s why Apple, Google and Microsoft are pushing to transform the likes of Siri and Cortana from their “passive” forms to more “active” ones, where the virtual assistant does useful things before it’s prompted.
You can see shades of this transformation in Google Now, which scans your communications for upcoming events and automatically puts them in your calendar. Siri’s “Proactive” feature in iOS 9 will cross-check your emails for potential names during an incoming call, while Cortana will remind you to file that expense report the next time you call accounts.
Facebook has also entered the fray with “M”, its own take on the virtual assistant, which is powered by a mix of real-life people and AI algorithms. The human element allows it to fulfil requests that its competitors can’t, such as making restaurant reservations on your behalf, finding a birthday gift for your spouse or booking weekend getaways. The service is integrated into Facebook’s existing Messenger app and, instead of voice commands, requests need to be typed in. The lack of OS-level integration means that M lacks the immediacy of its big-name competitors. That said, it will be interesting to see how Facebook develops the product over time as it learns more about the personal habits of its users.
But the big four aren’t the only ones pushing the boundaries.
Hound, the virtual assistant app from SoundHound (the company behind the popular music identification app), in many ways puts the likes of Cortana, Siri and Google Now to shame with its speed, accuracy and ability to reply to increasingly complicated requests.
Hound has generated a great deal of buzz ever since the company’s CEO, Keyvan Mohajer, released a demo on YouTube showing the app in action. The video racked up 1.5 million views in a few days. The company has since released the app in beta form and, while it might seem to have come out of nowhere, Mohajer and his team have been working on the voice recognition and natural language understanding tech that powers Hound for over a decade. It wasn’t until the success of SoundHound that they were finally able to throw the necessary funds and resources at the project.
Hound is able to process long strings of voice commands as soon as they are spoken while also taking in the entire question, enabling it to understand negation, something that Siri can’t do. So you can say “show me restaurants except Italian food”. And it can build on previous questions just as if you were speaking to a real person. For example, you can start with “show me hotels near the airport for less than $300 a night starting from Friday” and then switch it up and ask “What if I check in on Saturday?”
Hound is currently available only in the US and even there it’s restricted to an invitation-based beta system. However, if the results from users so far are anything to go by, Hound could be a game changer.
Then there’s Qualcomm’s Zeroth, which doesn’t require voice or a data connection to function. The platform seeks to anticipate your actions and perform intuitive tasks entirely on-device and without user input.
Qualcomm has so far shown off the platform’s visual capabilities: it can train the smartphone camera to recognise objects, faces and scenes, and autonomously adjust manual camera settings based on what it sees to produce the best possible shot.
As the phone recognises what’s actually in the picture, it will also index your pictures and make them searchable by keywords. For example, searching for “dog at the park” will find the correct pictures because it knows what a dog and the park look like. It can even automatically tag people’s names in the picture by cross-checking faces with images in your contacts or social media.
The scene detection and object recognition capabilities are extensive and because it doesn’t offload the processing to the cloud, results are displayed almost instantly. Qualcomm says the detection vocabulary improves the more the camera is used.
There are more than 10 sensors on a modern smartphone absorbing information, and Qualcomm plans to process all of that data and actively make decisions to improve the user experience. Think of it as a form of AI in your smartphone that’s constantly learning about the environment around you and your own personal smartphone habits.
Zeroth is optimised for Qualcomm’s upcoming flagship chip, the Snapdragon 820, which is expected to power most of the premium-tier Android smartphones released next year. Eventually, Zeroth is likely to move into cars and wearables as well.
The race to take the virtual assistant to the next level is on. So while today it might be nothing more than a fun novelty, tomorrow it might be indispensable.