The Software Development Blog | AndPlus

Alexa Presentation Language - A new future for NLP

Written by Brian Geary | Oct 23, 2018 1:05:00 AM

When Amazon’s Echo product line first appeared in 2014, its user interface was all about (nay, only about) voice: spoken commands to, and spoken responses from, Alexa, the device’s natural-language-processing personality (known in Amazon’s parlance as an “intelligent personal assistant”).

For a non-ambulatory entity that can only hear and speak, Alexa was (and remains) remarkably capable. She can order things from Amazon, place and answer phone calls, look up facts, play music, answer questions about news, weather, and sports, and much more. When integrated with smart-home systems, she can turn lights on and off, lock and unlock doors, and adjust the temperature in the house. Add third-party apps (called “skills”) and Alexa’s capabilities can be broadened even more.

Until recently, however, you couldn’t ask Alexa to show you anything, because no Echo devices had display screens.

The Power of Visual Communication

All that changed in 2017 with the release of the Echo Show device, which sported a 7-inch touch screen; the second-generation Show, with a 10-inch screen, followed in 2018. The Echo Show enables Alexa to, for example, place and receive video calls, show you the view from a security camera, display the lyrics and album art for a song you’ve requested, and show you a recipe you’ve asked her to find.

The visual aspect opens up whole new worlds of potential applications for Alexa. But it wasn’t until the release of the second-generation Echo Show that Amazon gave third-party developers an easier way to build visual-based skills for Alexa: the Alexa Presentation Language, or APL.

All About APL

APL, a part of the Alexa Skills Kit, enables developers to easily marry Alexa’s natural language processing capabilities (both processing spoken commands and generating spoken responses) with visual cues and elements, as well as input from the touch screen or a remote control.

What does this mean? For one thing, it means that Alexa skills developers are not limited to the Echo product line; they can develop skills not only for other Amazon devices (Kindle, Fire TV…) but also for third-party devices with screens.

Imagine, for example, a smart refrigerator with an Alexa-powered monitoring system built in. “Alexa, show me what’s in the fridge and what’s about to expire.” Or, “Alexa, what can I make with what I have in the fridge?” Perhaps you want to make lasagna this week, but don’t want to buy ingredients you already have. Ask Alexa to make up a grocery list with that constraint and send it to your smartphone. (Or just have Alexa order the items for delivery from your local Whole Foods store.)

From a technical perspective, APL is similar to other languages that developers use for visual user-interface development, and so should have a gentle learning curve. It’s designed to be flexible regarding content placement, which means skills can be easily adapted to different form factors and screen sizes.

APL skills work by generating JSON documents that are sent to the device for processing and rendering on the screen. Elements that can be rendered include images, text (scrolling or fixed), and slide shows or other sequences of visual elements; support for media files (video and audio) and HTML5 objects is expected soon. Developers can also synchronize what’s shown on the screen with what Alexa is speaking, thus adding the ability to reinforce the spoken words with visuals (and vice versa).
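To make the idea of “generating JSON documents” a little more concrete, here is a rough sketch in Python of what a skill might hand back to the device. It builds a small APL document (a title plus an image) and wraps it in a RenderDocument directive along with the data to display. The function names, the token value, the recipe fields, and the example.com image URL are all made up for illustration; treat this as one plausible shape for such a payload rather than a complete or authoritative skill response.

```python
import json


def build_apl_document():
    """Return a minimal APL document as a plain dict (illustrative layout only)."""
    return {
        "type": "APL",
        "version": "1.0",
        "mainTemplate": {
            # "payload" is the name under which the data sources are bound.
            "parameters": ["payload"],
            "items": [
                {
                    "type": "Container",
                    "items": [
                        # ${...} expressions are resolved on the device,
                        # not by Python; they stay as literal strings here.
                        {"type": "Text", "text": "${payload.recipe.title}"},
                        {
                            "type": "Image",
                            "source": "${payload.recipe.imageUrl}",
                            "width": "50vw",
                            "height": "50vh",
                        },
                    ],
                }
            ],
        },
    }


def build_render_directive(title, image_url, token="recipe-screen"):
    """Wrap the document and its data in a RenderDocument directive.

    The token and the "recipe" data source are hypothetical names
    chosen for this example.
    """
    return {
        "type": "Alexa.Presentation.APL.RenderDocument",
        "token": token,
        "document": build_apl_document(),
        "datasources": {"recipe": {"title": title, "imageUrl": image_url}},
    }


if __name__ == "__main__":
    directive = build_render_directive(
        "Lasagna", "https://example.com/lasagna.png"
    )
    # This JSON is what would travel to the device for rendering.
    print(json.dumps(directive, indent=2))
```

Keeping the layout (the document) separate from the content (the data sources) is what lets the same skill reuse one layout for different recipes, or swap in a different layout for a different screen.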

Endless Possibilities

Much of what’s described here can already be done with conventional programming languages, frameworks, and devices, but the power lies in the ability to leverage Alexa’s natural-language processing capabilities, meaning that developers don’t have to invent it all over again. It also gives companies the flexibility to simply work within the Amazon/Echo ecosystem or enhance that ecosystem with special-purpose devices.

Taken together, it all means that Amazon’s footprint can be expected to grow even larger as more developers find useful and innovative applications for Alexa-powered devices. This will only help such devices continue to evolve from cool gadgets to indispensable tools for daily living.

And if Alexa ever does become ambulatory (i.e., a mobile robot), there will be almost nothing that she can’t help with. Look out, world, here comes Alexa.
