In the 1890s, a German math teacher by the name of Wilhelm von Osten was convinced that his horse, named Hans, was capable of counting, addition, subtraction, square roots—in short, all manner of math problems. Von Osten would ask Hans a math question, and Hans would tap out the answer with one of his hooves. (Obviously, Hans was limited to positive integers.)
Von Osten took Hans on tour, delighting and amazing crowds all over Germany. There were, of course, some skeptics, and a blue-ribbon panel of experts was assembled to investigate. They found nothing fraudulent—it seemed that Hans really was smarter than your average horse.
One holdout skeptic, psychologist Oskar Pfungst, wasn’t convinced. He asked von Osten if he could conduct some experiments with Hans, and von Osten agreed. To make a long story short, it turned out that Hans wasn’t really counting or calculating; he was responding to his audience’s subtle body-language cues to tell when to start and stop tapping his hoof. As horses go, Hans was smart, but not in the way von Osten thought.
What’s the Deal With Virtual Assistants Today?
In a way, today’s virtual assistants—the likes of Amazon’s Alexa, Apple’s Siri, Microsoft’s Cortana, and Google’s Assistant—are like Hans the horse. They’re all smart, each in its own way, but only under narrowly defined circumstances.
Virtual assistants have been around for a few years now, and although they are getting better all the time, all of them still have a way to go before they can really be considered “smart” in a general sense of the word. Let’s have a look at them and what their capabilities are today.
Siri is available on pretty much any Apple device, from its phones, tablets, and laptops to the Apple Watch and the Apple HomePod smart speaker, as well as a very small selection of third-party devices.
However, the Siri experience varies widely among devices. Reviewers have reported that questions Siri handles easily on the iPhone can stump it on the HomePod. Siri also seems to have trouble “hearing” questions or commands when it’s playing music or reading news—in other words, you can easily get it to play something you want to hear, but it’s hard to get it to stop.
Siri excels at certain tasks that challenge other virtual assistants, such as finding nearby restaurants and making reservations at the one you select. Other tasks seem to be beyond its current capabilities; it can give you driving directions to get from A to B, but it has trouble telling you what bus or train to take if you opt for public transportation.
Google Assistant is available on a much wider range of devices: all Android devices, all iOS devices, and Google’s line of smart speakers, as well as smart speakers from a number of other manufacturers.
In reviews, Google Assistant performed similarly to Siri—good at certain tasks, not so good at others. Google Assistant also suffered from the same “hearing” malady that plagues Siri when the device is playing music or speech.
Amazon’s Alexa is available on a large number of devices in various form factors: not just smart speakers, but the various audio- and video-enabled Amazon Echo devices, plus Amazon’s Fire TV and Fire tablet devices and a selection of third-party devices. One area that Amazon has not pursued with Alexa is standalone mobile phones. The Alexa app is intended to configure and control the other Alexa-enabled devices, and doesn’t do much else.
Where Alexa stands out is the vast number of third-party apps, or “skills,” that are available. These leverage the natural-language processing capabilities of Alexa to power all kinds of applications. In this respect, Alexa is way ahead of its competitors.
However, as a natural-language user interface, Alexa has its shortcomings, just like the others. Although it’s very good at ordering things from Amazon, certain questions and commands that should be easy to understand can confuse Alexa.
Like Google Assistant, Microsoft’s Cortana is available for Android and iOS devices, as well as Windows PCs. The resemblance ends there, however. Cortana is widely regarded as the weakest of the Big Four, with complex setup and poor voice recognition performance.
What Do We Really Want?
Perhaps now—2019—is a good time for us to collectively ask ourselves, “What do we really want from our virtual assistants?”
If the answer is “Be like human assistants,” we should prepare ourselves for disappointment for the foreseeable future. As reported in Forbes, a comprehensive review of each virtual assistant’s ability to answer questions and perform tasks of low or moderate complexity turned up dismal results for all of them. Each was stronger in certain areas than in others, but overall, calling them “assistants” seems to be more aspirational than actual.
The main problem is that natural-language processing is still very much in its infancy. A lot of time and effort is still needed to make these services reliably understand the questions users are asking and the context in which they are asked. Ask your favorite virtual assistant “How much is 10 dollars in euros?” and most will be able to answer; but if you follow that up with “What about British pounds?” most will crash and burn, because they don’t know that the second question is related to the first.
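To make the follow-up problem concrete, here is a minimal Python sketch of one way an assistant might carry context from one question to the next. It is a toy under stated assumptions, not how any of the products above actually work: the names `ConversationContext`, `ConversionRequest`, and `interpret`, along with the tiny currency vocabulary, are all invented for illustration, and the sketch only works out what is being asked without looking up any exchange rates.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary for this sketch; a real assistant would need far more.
CURRENCIES = {
    "dollars": "USD",
    "euros": "EUR",
    "british pounds": "GBP",
    "pounds": "GBP",
}

# "british pounds" is listed before "pounds" so the longer name matches first.
CURRENCY_PATTERN = re.compile(r"\b(british pounds|pounds|dollars|euros)\b")
AMOUNT_PATTERN = re.compile(r"\d+(?:\.\d+)?")


@dataclass
class ConversionRequest:
    amount: float
    source: str
    target: str


class ConversationContext:
    """Remembers the last understood request so a follow-up can reuse it."""

    def __init__(self) -> None:
        self.last_request: Optional[ConversionRequest] = None

    def interpret(self, utterance: str) -> Optional[ConversionRequest]:
        text = utterance.lower()
        codes = [CURRENCIES[name] for name in CURRENCY_PATTERN.findall(text)]
        amount = AMOUNT_PATTERN.search(text)

        if amount and len(codes) == 2:
            # A complete question: an amount plus source and target currencies.
            request = ConversionRequest(float(amount.group()), codes[0], codes[1])
        elif self.last_request is not None and len(codes) == 1:
            # A follow-up naming only a new target: reuse the previous
            # amount and source currency.
            previous = self.last_request
            request = ConversionRequest(previous.amount, previous.source, codes[0])
        else:
            # Not enough information, and no earlier turn to fall back on.
            return None

        self.last_request = request
        return request


context = ConversationContext()
print(context.interpret("How much is 10 dollars in euros?"))
print(context.interpret("What about British pounds?"))
```

With the first question remembered, the follow-up resolves to a request for 10 dollars in pounds. Put the same follow-up to a fresh `ConversationContext` and `interpret` returns `None`, which is roughly the crash-and-burn described above. The hard part in practice is that real follow-ups rarely fit a pattern this tidy.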
Another challenge that remains for virtual assistants to overcome is reliable performance in noisy (aka “real”) environments. Most virtual assistants can accurately parse the speech of one person speaking in a quiet environment. But if you’re driving down the freeway and ask your virtual assistant for directions, between the engine and road noise and the kids fighting in the back seat, your virtual assistant will be at a loss to hear you, let alone understand what you’re asking it to do.
And keep in mind that all of this is hard enough in just one language. Multiply that by the vast diversity of human languages and dialects, not to mention regional idioms and terminology and the ambiguities that humans deal with intuitively, and the task is truly daunting.
What Can We Really Expect?
In the short term, what can we expect from virtual assistant technology?
- Expect continued good performance in certain narrow specialties, such as Alexa’s integration with smart home systems.
- Expect incremental improvement in voice recognition and natural-language processing performance. These services are powered by machine learning, and the machines have a lot to learn. They will continue to get better, but it will take time.
- Expect improvements in these services’ ability to “hear” in challenging environments. There is a fair amount of research ongoing in audio processing technology, with applications in everything from smart hearing aids to more-realistic virtual reality experiences. Virtual assistants should benefit as well.
- Expect integration with a wider variety of devices. Kitchen appliances, automobiles, and other special-purpose gizmos can benefit from the addition of a hands-free, natural-language user interface, and virtual-assistant technology can be brought to bear. Look for the technology to turn up in unexpected places. It might seem ridiculous to talk to your microwave to get it to heat up last night’s leftover pizza, but don’t knock it until you’ve tried it.
Bottom line: Be patient, and don’t expect too much too soon. The benefits of this technology are clear, but it will be some time before we have something truly worthy of the term “virtual assistant.”