If you buy into the hype, you might believe an army of intelligent robots is on the march RIGHT NOW, coming to TAKE YOUR JOB.
Sorry to disappoint, folks, but the utopian (or dystopian, depending on your perspective) vision of robots tending to our every need, leaving us free to pursue…whatever we would pursue if we didn’t have to work, is still a fantasy that may not be realized in our lifetimes. A whole lot of research and development is still needed before that happens.
That said, we are a good deal closer than we were just a few years ago, thanks to recent research in machine learning, and specifically, deep learning.
Sorting out the Terms
The term “machine learning” refers to any technology in which a device gains proficiency at a task without being explicitly programmed for it. There are many approaches to machine learning; some of the most prominent involve artificial neural networks (ANNs). ANNs attempt to mimic, on an extremely limited scale, the way biological brains work by modeling the connections between “neurons.” A specific type of ANN, and one that has seen great success in the last few years, is the “deep learning” network.
Deep learning refers to the specific way the neurons are modeled. Generally speaking, a deep-learning ANN has its nodes, or neurons, arranged in layers: an input layer, an output layer, and one or more “hidden” layers in between. The “deep” refers to the number of these layers. Each layer may contain anywhere from a handful of nodes to many thousands; additional nodes and layers increase the system’s complexity, and therefore the time and computing power needed to train it.
Each node in a deep-learning network is defined by mathematical functions that describe how it responds to an input signal. The signals for the input layer come from a digitized representation of something—an image, say, or a sound recording. The combined responses of the nodes produce an output, such as the identity of objects in the image or words in the recording. The system “learns” by processing a training data set—a set of inputs whose desired outputs are already known—and comparing its own outputs to those “real” answers. By repeatedly tweaking the mathematical parameters of its nodes to shrink the difference, the system gradually gets better at its task.
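The learning loop described above can be sketched in a few lines of Python. This is a toy illustration, not how production systems are built: a tiny network with one hidden layer learns the XOR function by gradient descent. NumPy is assumed to be available, and the layer sizes, learning rate, and error measure are arbitrary choices made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: inputs whose desired outputs are known (here, XOR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    # The mathematical function each node applies to its input signal.
    return 1.0 / (1.0 + np.exp(-z))

# Parameters of the nodes: weights and biases, randomly initialized.
W1 = rng.normal(size=(2, 4))   # input layer -> hidden layer (4 hidden nodes)
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden layer -> output layer
b2 = np.zeros((1, 1))

def forward(inputs):
    hidden = sigmoid(inputs @ W1 + b1)        # hidden-layer responses
    return hidden, sigmoid(hidden @ W2 + b2)  # output-layer response

_, out = forward(X)
initial_error = np.mean((out - y) ** 2)

lr = 0.5  # learning rate: how big each parameter tweak is
for _ in range(10000):
    h, out = forward(X)
    # Compare the system's output to the "real" answers...
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # ...and tweak each node's parameters to shrink the error.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

_, out = forward(X)
final_error = np.mean((out - y) ** 2)
print(f"error before training: {initial_error:.3f}, after: {final_error:.3f}")
```

Real deep-learning systems follow the same pattern, just with many more layers, millions of parameters, and far larger training sets.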
Applications for Deep Learning
Although still a young field, deep learning currently holds the most promise among machine-learning approaches because it has been shown to learn more efficiently and reliably than the alternatives. So what is deep learning good for, and where is it going?
Current capabilities of deep-learning systems include:
- Image processing. Deep-learning systems have proven fairly adept at identifying objects and patterns in images. Some applications here include automatic tagging of photographs, reading handwriting, and reading text within images in various languages.
- Sound processing. Understanding speech, identifying musical tunes, and translating spoken words are among the applications being used or under development.
- Medical diagnoses. Using historical data from large numbers of patients, deep-learning systems can help doctors diagnose a wide range of diseases and disorders.
- Financial market transactions. Deep-learning systems can be used to predict which way the markets are likely to go and process transactions accordingly.
As you can see, the current applications are narrow in scope, and the computing power required to train deep-learning systems makes it impractical to develop one as a “do-it-yourself” project. But as the technology matures and tools emerge to make deep-learning systems easier to build and train, many more of them will find their way into our daily lives. Much of this will happen behind the scenes at first: major corporations, hospitals, and government agencies will use these systems to run their operations, largely unbeknownst to their customers.
When will deep learning really impact the population at large? Remember the army of robots we talked about earlier? Those intelligent robots represent the convergence of multiple deep-learning systems, and they will be the most visible manifestations of the technology. To work reliably, they will need deep-learning systems that enable them to perform these tasks, among others:
- Recognize objects using various sensors (such as video, radar, LiDAR, and ultrasound) in real time.
- Interact with those objects. Robots will have to learn that picking up and cracking an egg to make an omelet requires different manipulation skills than picking up and opening a jar of salsa.
- Deal with the unknown and unexpected, such as obstacles and missing objects.
- Perceive and understand human emotions and respond accordingly.
We have bits and pieces of these technologies now, but we don’t yet have a complete, integrated R2-D2-style package. But hang on to your hat, because your robot is coming.