Is “Deep Learning” a Revolution in Artificial Intelligence?

Can a new technique known as deep learning revolutionize artificial intelligence, as yesterday’s front-page article in the New York Times suggests? There is good reason to be excited about deep learning, a sophisticated “machine learning” algorithm that far exceeds many of its predecessors in its abilities to recognize syllables and images. But there’s also good reason to be skeptical. While the Times reports that “advances in an artificial intelligence technology that can recognize patterns offer the possibility of machines that perform human activities like seeing, listening and thinking,” deep learning takes us, at best, only a small step toward the creation of truly intelligent machines. Deep learning is important work, with immediate practical applications. But it’s not as breathtaking as the front-page story in the New York Times seems to suggest.

The technology on which the Times focusses, deep learning, has its roots in a tradition of “neural networks” that goes back to the late nineteen-fifties. At that time, Frank Rosenblatt attempted to build a kind of mechanical brain called the Perceptron, which was billed as “a machine which senses, recognizes, remembers, and responds like the human mind.” The system was capable of categorizing (within certain limits) some basic shapes like triangles and squares. Crowds were amazed by its potential, and even The New Yorker was taken in, suggesting that this “remarkable machine…[was] capable of what amounts to thought.”

But the buzz eventually fizzled; a critical book written in 1969 by Marvin Minsky and his collaborator Seymour Papert showed that Rosenblatt’s original system was painfully limited, literally blind to some simple logical functions like “exclusive-or” (as in, you can have the cake or the pie, but not both). What had become known as the field of “neural networks” all but disappeared.

Rosenblatt’s ideas reëmerged, however, in the mid-nineteen-eighties, when Geoff Hinton, then a young professor at Carnegie-Mellon University, helped build more complex networks of virtual neurons that were able to circumvent some of Minsky’s worries. Hinton had included a “hidden layer” of neurons that allowed a new generation of networks to learn more complicated functions (like the exclusive-or that had bedeviled the original Perceptron). Even the new models had serious problems, though. They learned slowly and inefficiently, and, as Steven Pinker and I showed, couldn’t master some of the basic things that children do, like learning the past tense of regular verbs. By the late nineteen-nineties, neural networks had again begun to fall out of favor.
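For readers who want to see the point concretely, here is a minimal sketch, written in modern Python with NumPy rather than anything Hinton actually used, of why the hidden layer matters: a tiny two-layer network trained with backpropagation learns exclusive-or, the very function a single-layer Perceptron provably cannot represent.

```python
# Illustrative sketch only: a tiny two-layer network learning exclusive-or.
import numpy as np

rng = np.random.default_rng(1)

# The four exclusive-or cases: the output is 1 only when exactly one input is 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

HIDDEN = 4                                  # a few hidden units suffice for XOR
W1 = rng.normal(size=(2, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, 1)); b2 = np.zeros(1)

lr = 1.0
for _ in range(20000):
    # Forward pass: inputs -> hidden layer -> output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of the squared error, pushed through both layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())                 # approaches [0, 1, 1, 0]: XOR learned
```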

Hinton soldiered on, however, making an important advance in 2006, with a new technique that he dubbed deep learning, which itself extends important earlier work by my N.Y.U. colleague, Yann LeCun, and is still in use at Google, Microsoft, and elsewhere. A typical setup is this: a computer is confronted with a large set of data, and on its own asked to sort the elements of that data into categories, a bit like a child who is asked to sort a set of toys, with no specific instructions. The child might sort them by color, by shape, or by function, or by something else. Machine learners try to do this on a grander scale, seeing, for example, millions of handwritten digits, and making guesses about which digits looks more like one another, “clustering” them together based on similarity. Deep learning’s important innovation is to have models learn categories incrementally, attempting to nail down lower-level categories (like letters) before attempting to acquire higher-level categories (like words).
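To get a feel for the “clustering” step described above (though not for Google’s or Hinton’s actual systems), here is a small sketch in Python that runs an off-the-shelf clustering algorithm, k-means from scikit-learn, on a standard collection of handwritten digits. The algorithm is never told which digit is which; it simply groups images that look alike, the machine equivalent of the child sorting toys with no instructions.

```python
# Illustrative sketch only: generic unsupervised clustering of handwritten digits.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans

digits = load_digits()            # 1,797 small 8x8 images of handwritten digits
X = digits.data                   # each image flattened into a 64-dimensional vector

# Ask for ten clusters; no labels are ever shown to the algorithm.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)

# Peek inside one cluster: which true digits ended up grouped together?
in_cluster_0 = digits.target[kmeans.labels_ == 0]
values, counts = np.unique(in_cluster_0, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```

Deep learning goes further than this, learning its categories layer by layer, but the basic idea of grouping by similarity without supervision is the same.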

Deep learning excels at this sort of problem, known as unsupervised learning. In some cases it performs far better than its predecessors. It can, for example, learn to identify syllables in a new language better than earlier systems. But it’s still not good enough to reliably recognize or sort objects when the set of possibilities is large. The much-publicized Google system that learned to recognize cats, for example, works about seventy per cent better than its predecessors. But it still recognizes less than a sixth of the objects on which it was trained, and it does worse when the objects are rotated or moved to the left or right of an image.

Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like “sibling” or “identical to.” They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems, like Watson, the machine that beat humans in “Jeopardy,” use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning.

In August, I had the chance to speak with Peter Norvig, Director of Google Research, and asked him if he thought that techniques like deep learning could ever solve complicated tasks that are more characteristic of human intelligence, like understanding stories, which is something Norvig used to work on in the nineteen-eighties. Back then, Norvig had written a brilliant review of the previous work on getting machines to understand stories, and fully endorsed an approach that built on classical “symbol-manipulation” techniques. Norvig’s group is now working with Hinton, and Norvig is clearly very interested in seeing what Hinton can come up with. But even Norvig didn’t see how you could build a machine that could understand stories using deep learning alone.

To paraphrase an old parable, Hinton has built a better ladder; but a better ladder doesn’t necessarily get you to the moon.

Gary Marcus, Professor of Psychology at N.Y.U., is author of “Guitar Zero: The Science of Becoming Musical at Any Age” and “Kluge: The Haphazard Evolution of the Human Mind.”

Photograph by Frederic Lewis/Archive Photos/Getty.