In the post on the Chinese room, while concluding that Searle’s overall thesis isn’t demonstrated, I noted that if he had restricted himself to a more limited assertion, he might have had a point: the Turing test doesn’t guarantee that a system actually understands its subject matter. Although the probability of humans being fooled plummets as the test goes on, it never quite reaches zero. The test depends on human minds to assess whether there is more there than a thin facade. But what exactly is being assessed?
I just finished reading Melanie Mitchell’s Artificial Intelligence: A Guide for Thinking Humans. Mitchell recounts how, in recent years, deep learning networks have broken a lot of new ground. Such networks have demonstrated an uncanny ability to recognize items in photographs, including faces, have learned to play old Atari games at superhuman levels, and have even made progress in driving cars, among many other things.
But do these systems have any understanding of the actual subject matter they’re dealing with? Or do they have what Daniel Dennett calls “competence without comprehension”?
A clue to the answer is found in what can be done to stymie their performance. Changing a few pixels in a photographic image, in a manner humans can’t even notice, can completely defeat a modern neural network’s ability to accurately interpret what is there. Likewise, moving key user interface components of an old Atari game over by one pixel, again unnoticeable to a human player, can completely wreck the prowess of these learning networks. And we’ve all heard about the scenarios that can confuse a self-driving car, such as construction zones, or white trucks against a cloudy sky.
In other words, whatever understanding might exist in these networks, it remains thin and brittle, subject to being defeated by unforeseen stimuli or, even worse, being completely fooled by an adversarial attack carefully crafted to defeat recognition (of, say, a fugitive’s face). These systems lack a deeper understanding of their subject matter. They lack a comprehensive world model.
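To make the “few pixels” point more concrete, here’s a minimal sketch of one common adversarial technique, the fast gradient sign method, written in PyTorch. It’s only an illustration under assumptions not taken from Mitchell’s book: `model` is assumed to be a trained image classifier, `image` a normalized input tensor, and `label` its true class.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Fast Gradient Sign Method: nudge every pixel by a tiny amount
    (epsilon) in whichever direction most increases the classifier's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # The per-pixel change is too small for a human to notice, yet it is
    # aimed precisely at the model's weaknesses, often flipping its prediction.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

The specifics matter less than the general point: the attack exploits statistical quirks in how the network carves up its input space, not anything resembling the meaning of the scene.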
The AI pioneer Marvin Minsky long ago made the observation that with AI, “easy things are hard,” that is, what is trivially easy for a three-year-old often remains completely beyond the capabilities of the most sophisticated AI systems.
Ironically, it’s often things that are hard for humans that computer systems can do easily. The term “computer” originally referred to skilled humans who performed calculations. Computation was the original killer app, a capability the earliest systems performed in a manner that left human computers in the dust. And we all use systems that do accounting, navigation, or complex simulations far better than anything we could do ourselves.
But even the most sophisticated of these systems remain idiot savants, supremely capable within a limited skill set, but utterly incapable of applying it in any general manner. Moving beyond these specialized systems to general intelligence has remained an elusive goal of AI research for decades. The difficulty of achieving it is often called “the barrier of meaning.” But what exactly do we mean by terms like “meaning”, “understanding”, or “worldview”?
As humans, we spend our lives building models of the world. For instance, even a very young child understands the idea of object permanence: that an object doesn’t cease to exist when you look away from it, at least unless someone or something acts on it, or unless it moves itself, if it’s that kind of object. We can think of object permanence as an example of intuitive physics, and the understanding that some systems can move themselves as intuitive biology.
As children get older, they also develop intuitive psychology, a theory of mind: an understanding that others have their own viewpoints, and an ability to predict how those others will react in various scenarios, which enables them to navigate social situations.
These intuitive models serve as a foundation that we build symbolic concepts on top of. Arguably, we only understand complex concepts as metaphors of things we understand at a more primal level, that is, a level involving our immediate spatio-temporal models of the world. When we say we “understand” something, what we typically mean is that we can reverse the metaphor back to that core physical knowledge.
So what does this mean for getting an AI to a general level of intelligence? Mitchell notes that a lot of researchers are now thinking that AI will need its own core physical knowledge of the world. Only with that base will these systems start understanding the concepts they’re working with. In other words, AI will need a physical body, along with time to learn about the world.
But is even that sufficient? The other day I relayed Kingson Man and Antonio Damasio’s proposal that AI needs to have feelings rooted in maintaining homeostasis to really achieve this general understanding. This might make sense if you think about what it means to actually perceive the environment. A perception is essentially a cluster of predictions, predictions made for certain purposes. For us, and any other animal we’re tempted to label “conscious”, those purposes involve survival and procreation, that is, feelings. A system with instincts not calibrated for satisfying selfish genes may perceive the world very differently.
Which leads to the question: how similar to us do these systems need to be to achieve general intelligence? At this point, I think it’s worth noting that it’s something of a conceit to label our own type of intelligence “general.” A case could be made that we’re a particular type of survival intelligence. Is the base of our intelligence the only possible one?
On the one hand, if we want AI systems to be able to do what we do, then it seems reasonable to suppose that their intelligence should be built from similar foundations. That said, hewing too closely to those foundations opens us to the dangers I noted in the post about Man and Damasio’s proposal. We want intelligent tools, not slaves.
On the other hand, do we necessarily want to limit AI systems by the biases of our species, or even of all animals? One of the things we hope to get from AI is insights that we as a species may be blind to. If we build them too much in our image, it seems like we’d be forgoing those types of benefits.
But it might be that we have little choice. Maybe we have to start by giving them a base similar to ours, until we learn enough about how this all works. Once we understand more, it may become obvious whether alternate bases are feasible. If they are, those alternate bases might produce astoundingly alien minds.
Even starting with the physical base, I tend to think shooting directly for human-level intelligence is unrealistic, at least until we first have fish-level, reptile-level, mouse-level, or primate-level intelligence to build upon.
How will we know that we’ve achieved general intelligence? A robust version of the Turing test remains an option, but Mitchell also discusses other interesting tests that go a bit further. One is a Winograd Schema test. Consider the following two sentences:
The city councilmen refused the demonstrators a permit because they feared violence.
The city councilmen refused the demonstrators a permit because they advocated violence.
In the first sentence, who feared violence? In the second, who advocated violence? The answers are obvious to humans with a world model, but no current AI system can reliably answer them. Winograd schema challenges attempt to get at whether the system in question has a real conceptual understanding of the text.
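As a rough illustration of how such a test is structured (not anything from Mitchell’s book), a Winograd schema item can be thought of as a pair of sentences differing by a single word, plus the candidate referents for the ambiguous pronoun. The `resolve_pronoun` function below is a hypothetical stand-in for whatever system is being tested.

```python
# A minimal sketch of a Winograd schema item, assuming a hypothetical
# resolve_pronoun(sentence, pronoun, candidates) -> str supplied by the
# system under test.
schema = {
    "candidates": ["the city councilmen", "the demonstrators"],
    "pronoun": "they",
    "variants": {
        # The single changed word ("feared" vs. "advocated") flips which
        # candidate the pronoun refers to -- answering correctly requires
        # some model of how councilmen and demonstrators tend to behave.
        "The city councilmen refused the demonstrators a permit "
        "because they feared violence.": "the city councilmen",
        "The city councilmen refused the demonstrators a permit "
        "because they advocated violence.": "the demonstrators",
    },
}

def score(resolve_pronoun):
    """Count how many sentence variants the system resolves correctly."""
    return sum(
        resolve_pronoun(sentence, schema["pronoun"], schema["candidates"]) == answer
        for sentence, answer in schema["variants"].items()
    )
```

Because the two variants are nearly identical strings, surface statistics alone give the system little to go on; that’s the point of the challenge.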
Another type of test uses Bongard problems, in which two groups of images are provided and the subject is asked to identify the characteristic that distinguishes one group from the other, such as one group containing only small shapes and the other only large ones. It’s a test of abstract pattern recognition that humans can usually pass, but that again currently lies beyond machine systems.
I’m not sure these tests are truly beyond the ability of a system to conceivably provide answers without deep comprehension, but elaborations like the Winograd and Bongard challenges do seem far more robust. But then, a one-hour Turing test also seems extremely difficult to pass with shallow algorithms.
So remember, when you see breathless press releases about some new accomplishment of an AI system, ask yourself whether it represents a break in the barrier of meaning. I don’t doubt that announcement will come someday, but it still seems a long way off.
Unless of course I’m missing something.