What does the Turing Test really mean?

The "standard interpretation" of the...
The “standard interpretation” of the Turing Test: player C, the interrogator, tries to determine which player – A or B – is a computer and which is a human. The interrogator may use only the responses to written questions to make the determination. (Photo credit: Wikipedia)

The Turing Test is in the news this week, first with a wave of hype about a historical accomplishment, then with a secondary wave of skeptical scrutiny.

The Turing Test was originally contemplated by Alan Turing in a 1950 paper.  Turing envisaged it as an alternative to trying to determine if a machine could think.

I propose to consider the question, “Can machines think?” This should begin with definitions of the meaning of the terms “machine” and “think.” The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words “machine” and “think” are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, “Can machines think?” is to be sought in a statistical survey such as a Gallup poll. But this is absurd. Instead of attempting such a definition I shall replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.

His replacement question was whether a machine could pass the test that now bears his name.  In the test, a human interrogator is in a separate room from another human and a computer.  The interrogator can communicate with the other human and computer only via teletype (or via texting or online chat in a modern version).  If the interrogator cannot distinguish which respondent is the human and which is the computer, then the computer passes the test.

So, what happened in the recent event that the news stories are discussing?  In summary, a computer program pretended to be a 13-year-old boy with only a limited understanding of English.  With a conversation limited to five minutes, the system managed to fool 33% of the human interrogators.  Fooling 30% or more of the interrogators is considered to be passing the test.
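
To make the pass criterion concrete, the arithmetic reduces to a simple proportion.  Here’s a minimal sketch (the judge counts are illustrative, chosen only to be consistent with the reported 33%; this isn’t the event’s actual tallying code):

```python
# Illustrative tally of the pass criterion described above.
judges_total = 30    # hypothetical panel size
judges_fooled = 10   # judges who misidentified the machine

deception_rate = judges_fooled / judges_total
print(f"deception rate: {deception_rate:.0%}")          # deception rate: 33%
print("passes" if deception_rate >= 0.30 else "fails")  # passes
```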

Now, the five-minute conversation and 30% threshold do go back to something of an offhand remark in Turing’s original paper.

I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.

But making the imitated human only 13 years old, and a foreigner with English as a second language, stacks the test in a way that I’m pretty sure Turing would not have considered valid.  How many of the interrogators would have been fooled if they had been expecting an adult who was fully conversant in English?  Consider what kind of conversation you might expect from a typical uncooperative 13-year-old in broken English, versus from an articulate adult.  The lower maturity level and the language barrier enabled the program to mask a huge gap in technical capability.

That’s not to say that this isn’t an accomplishment, but we’re not yet at the stage where most of us would be tempted to think a program that fooled these interrogators, in the limited fashion described, is a “thinking machine.”  But it is getting close enough for us to ask the question: does this mean the test that Turing originally proposed is flawed?  If we say yes, are we merely moving the goalposts when it comes to regarding a machine as thinking?

Turing might have been a bit naive to consider a 30% success rate (from the point of view of the computer) after a five-minute conversation as a meaningful threshold.  He also predicted that a machine with only 125 megabytes of memory would be able to do the job, which isn’t true, and he predicted that by the time we had achieved this success rate with the test, we’d already commonly regard computers as thinking entities, which hasn’t come to pass.
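
As a quick sanity check on that figure, the 125 megabytes is just Turing’s 10⁹ binary digits converted to bytes:

```python
# Converting Turing's predicted storage of 10^9 binary digits (bits)
# into the 125 megabytes mentioned above.
bits = 10**9
megabytes = bits / 8 / 1_000_000  # 8 bits per byte, 10^6 bytes per megabyte
print(megabytes)                  # 125.0
```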

In his defense, these predictions were all made in 1950.  The central thesis of the Turing Test is that it is scientifically meaningless to ask whether a machine can think; all we can do is see whether it can convince us that it thinks.  In 1950, it was probably reasonable to assume five minutes and 30% were a meaningful threshold for this.  We have to consider how people in 1950, unfamiliar with the last 64 years of computer technology, would have reacted to the best of the modern chatbots.

Ok, but then the question is, what would be a meaningful threshold by today’s standards?  Personally, I wouldn’t consider the test meaningful unless the machine were required to converse with a human for as long as the human needed to make a confident decision.  And the test would need to resemble Turing’s original configuration, involving choosing between a human and a machine, or the machine would have to pass at the same rate as an actual human being.  Granted, this is a tough standard, but it’s hard for me to see people being tempted to conclude there’s another mind there without something like it being met.

And, of course, we still have the old objection that the Turing Test is really measuring humanity rather than intelligence.  But I think asking humans to regard a machine as a thinking entity is asking them, to some degree, to anthropomorphize it.  We already have a built-in tendency to do that (think of how many people do it with their pets, their cars, etc.), so the threshold isn’t as insurmountable as it might look.

One thing Turing did say in his paper that is still critical to understand: the hardware is important, but the magic sauce of this will be in the programming.  And that’s still perhaps the hardest barrier to be surmounted, although with ongoing progress in neuroscience and AI research, I personally think we’re going to get there.

But the philosophical question remains.  Does a computer passing the Turing Test, at any level of difficulty, really tell us anything about its internal state, whether it can actually think, whether it is conscious, whether it has an internal experience?  The question is, can it successfully emulate that experience to us and yet not have it at some functional level?

Your answer to that question probably depends on your attitude toward philosophical zombies: whether you believe it is possible for a being to exist that behaves and functions as though it is conscious, to the extent that everyone around it believes it is conscious, without actually being conscious.  If yes, then you’ll probably be slow to accept a program, computer, AI, or machine as a fellow being.

Of course, it’s all very well to talk about this in an abstract philosophical fashion.  Many of us, whatever our position, may think differently after we actually have a conversation with such an entity.

36 thoughts on “What does the Turing Test really mean?”

  1. The most interesting thing about the Turing test is what it implies about our knowledge of other people’s consciousness. We are mostly just good historians. Put us behind a wall that screens out the historical data, and by what right do we make our judgments?

    1. Good point. The philosophical problem of other minds. Interestingly, Turing’s test was based on something called the imitation game, where someone had to guess which respondent was male or female.

  2. The Turing test should probably only be called “won” if Turing himself were still alive and would nod. You see, humans are far more complex than most people thinking about TESTING humans can imagine. For example, when Turing was sent to the United States in, I believe, 1944, he was told that all his secret papers were sent on by diplomatic bag and that he should under no circumstances take any papers with him personally. Which he did. It took the British and US intelligence services a while to get him into the country – he had left his passport at home – no papers, under any circumstances, you see?! Now, the “well-behaved” machine ‘of course’ brings ‘its’ papers. But a human is more complex (when you consider the outliers too), so that I am confident I will still spot a Goostman-clone twenty or a hundred years from now from a mile away …

    1. You may be right. Only time will tell. Of course, even if we can distinguish a machine’s personality from a human’s, that doesn’t mean we won’t regard that machine as a thinking entity once its processing becomes sophisticated enough. The programmers might know the truth, but then the neuroscientists are starting to know the truth about us.

    1. That fragment is pretty disjointed, but I didn’t see it stated in the article that it actually was from Eugene Goostman or some other chat-bot. Not that I’m expecting the actual Eugene transcripts to be that impressive, with the whole 13-year-old-from-Ukraine thing thrown in to cloud things up.

  3. I don’t think the Turing test has anything to do with consciousness. There can be intelligence without consciousness, and maybe consciousness without intelligence (but I wouldn’t be surprised if that was incorrect). I don’t believe philosophical zombies make any sense, BTW. If consciousness is a real thing (and I think it is), then we should be able to detect/measure/determine whether an object (e.g. a brain or a computer) is conscious or not. However, we don’t have a theory of consciousness that’s sufficiently advanced to devise a consciousness test. Yet. But we’ll get there, I’m certain.

    I agree with you that the 5-minute limitation doesn’t make any sense and that judges should be able to chat as long as needed.

    Finally, I think we should have constraints on the number of judges and their qualifications.

    1. An excellent comment. I agree with most of your points.

      I do think consciousness is an information architecture that we’ll eventually figure out. (I think people like Michael Graziano are closing in on it.) And I don’t think a machine / computer / program will be conscious until it has that or an equivalent architecture. The probability of us stumbling on it by accident seems infinitesimal. It will have to be meticulously programmed.

      The question is, can an entity pass the Turing test, in a sustained and reliable manner, without that architecture? If it could, then wouldn’t we have a philosophical zombie on our hands?

      On human judges, I think they should be a representative cross section of humanity. I wouldn’t think we’d want it dominated by either uneducated naive users or highly trained scientists. The Turing test is as much about what our culture will accept as a fellow being as it is about the technology itself.

      1. “The question is, can an entity pass the Turing test, in a sustained and reliable manner, without that architecture? If it could, then wouldn’t we have a philosophical zombie on our hands?”

        As I understand it, accepting that philosophical zombies are possible is the same as refuting physicalism. I’m with Dennett on this: they don’t make sense, even in theory.

        Also, by definition, the Turing test wouldn’t allow the judges to determine if the subjects are “indistinguishable from a normal human being except in that [they lack] conscious experience”. They would need to be able to examine their brains, I guess, which the Turing test doesn’t permit. 🙂

        1. Zombies aren’t a problem if you are prepared to accept a little bit of epiphenomenalism. That gets you out of the Chinese room too, I think. But it still doesn’t get you meticulously programmed consciousness, does it?

          1. I think it depends on whether we’re talking about a behavioral zombie or a neurological zombie. I can’t see any way that a neurological zombie, a being that is physically identical to a conscious human being but isn’t conscious, could exist, at least without some form of substance dualism.

            I’m far less sure about a behavioral zombie, an entity that appears to be conscious, but isn’t. I would think that a machine that passes the Turing test without the consciousness architecture would be a behavioral zombie. Of course, we could never eliminate the possibility that the behavioral zombie was in fact conscious through an alternate architecture.

          2. Interesting. I wasn’t aware of the difference between behavioral zombies and neurological zombies. They’re completely different concepts. So I’ll rephrase what I wrote: neurological zombies don’t make any sense, I agree. I’m not sure behavioral zombies make sense either, though. Can we determine if a person/object/whatever is conscious just by looking at it, at a distance? I don’t think so.

          3. I read “Consciousness: Confessions of a Romantic Reductionist” by Christof Koch a couple of months ago, and he mentions Giulio Tononi’s integrated information theory (IIT). Again: interesting, but it didn’t really help me understand what consciousness is or could be.

          4. Though I agree with you that neurological zombies are not conceivable (though they are logically possible), I don’t think that proponents of the argument – Chalmers or Searle for example – would say that it demands substance dualism. Zombies provide an argument against reduction. I think they are ultimately an argument pointing up the limitations of a method, and as Searle would say, the “massive confusions” surrounding the assumptions of analytic method about things like identity.
            The neurobiology of consciousness still faces, and I think will always face, the same problems with necessity, sufficiency, etc., which the anti-reductionist arguments seek to expose. It’s hard to see how we could make a definitive judgment about another’s first person experience without relying on some degree of self-report. It’s the nature of the beast.
            I’m not sure that it matters, though, and I’m concerned about our fascination with this issue. It is not entirely theoretical. It applies here and now to people with impaired consciousness, for instance.
            Everybody wants to escape the hard problem regarding brains/minds in a persistent vegetative state or without effective orientation, for example. The implication is that the problem needs escaping.
            The underlying assumption seems to be that intentionality is our most precious quality, simply because it is not quantifiable (i.e. its status as something irreducible makes it the only solid thing about our identities) or because subjectivity just is the basic, person-making unit of neurological function (and therefore the basic value to which we must attend) and therefore the thing which we ought to be aiming to quantify.
            The Turing test contains a great question undermining that underlying assumption – if you admit that you can’t make a determination regarding the presence or absence of subjective experience, no matter what somebody else may know about mechanisms, what then? I see that as the really interesting and useful aspect of the whole project.

          5. I tend to think both Chalmers and Searle are confused, seeing distinctions and mysteries where there are none. Admittedly, I can’t rule out that I’m the one who is confused, not seeing something important that they see.

            The vegetative state dilemma is definitely a sobering reminder of how real some of these discussions can be for some people. The recent stories of brain scans being used to monitor whether there is any conscious response to questions put to locked-in patients are both terrifying and heartening.

            I think your last paragraph is a good summary of the core debate about the Turing test. Is it, or something like it, ultimately the only real measure of another mind that we have?

  4. I’ll be impressed when a computer can judge whether another computer passes the Turing test, like a human being can.

    1. That’s an interesting idea, and I know you mean through conversation, but it occurs to me that computers are already distinguishing machines from humans with CAPTCHA tests, although the reliability of that test is decreasing every year.

      1. That’s not computers doing it though. It’s whoever set the test. After all, a hammer can tell the difference between a human being and a rock. One tough test would be to ask the computer to work out whether the person running the Turing test is a human being or a machine. Then there is symmetry and the test becomes a little more meaningful. Not much though, it seems to me.

        1. Ah, but you can always claim that whoever programmed or engineered the machine is the real person setting the test. At least until we get to the point where the machine itself designed the test.

          1. Yes, quite. (I read that as ‘the real person passing the test’ or setting it). The Turing test seems a reasonable test for some minimum level of computation and programming, but I don’t think Turing would be pleased with the way it has become associated with ideas about minds and consciousness.

  5. There are three separate things that occurred to me reading this.

    First, I agree with you about how the Turing test should be implemented. No time limit. The general question is very simple: convince me you’re a free-thinking intelligence, not a series of programmed responses.

    (There’s an AI test that I think is even more fascinating: I forget what it’s called, but the idea is that a sufficiently intelligent AI could talk you into setting it free against your determination to keep it enclosed (in, say, a laboratory setting). The game in this case is, given some reasonable time limit, can the AI talk you into “giving it the keys.”)

    Second, I heard about this bit of news and, at the time, thought (A) that’s cheating, and (B) just how good were those examiners? If you’ve ever played with something like ELIZA or the more capable versions since, you know it doesn’t take that much conversation to spot the machine (see the little sketch at the end of this comment).

    I did wonder if the examiners were overly forgiving about online communication. The way people use language in social media these days is so mangled compared to formal use that some errors of grammar and such might be forgiven when they shouldn’t be.

    That was an interesting comment from one of your readers, essentially about not necessarily equating the ability to converse with intelligence. I think there we’re dealing more with determining humanity — intelligence has never been the primary criterion for that. I think that’s a different question entirely.

    Third, it’s possible the Turing Test is humanly biased by relying on conversation. But then the whole point of the TT is testing a machine made to be “like a human.” Back in 1950 the definition of intelligence could be casual, but these days we need to really define the terms.

    And that’s really tricky because of all the parameters involved. But to me the Turing Test assumes I’m speaking with someone fluent and articulate and cooperative. It’s meant to be a replication of the naturalness and flow of a human conversation, not an interrogation or inquisition.
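
    For anyone who hasn’t played with one of these: an ELIZA-style bot is little more than a ranked list of pattern-and-reflection rules, which is why the illusion unravels so quickly. A toy sketch of the general technique (my own illustration, not Weizenbaum’s actual script):

    ```python
    import re

    # Toy ELIZA-style responder: scan ranked regex rules and echo back
    # a canned reflection built from the matched fragment.
    RULES = [
        (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
        (re.compile(r"\bi am (.+)", re.I),   "How long have you been {0}?"),
        (re.compile(r"\bmy (.+)", re.I),     "Tell me more about your {0}."),
    ]

    def reply(utterance: str) -> str:
        for pattern, template in RULES:
            match = pattern.search(utterance)
            if match:
                return template.format(match.group(1))
        return "Please go on."  # the tell: no memory, no understanding

    print(reply("I feel like the judges were too generous"))
    # -> Why do you feel like the judges were too generous?
    ```

    A few exchanges like that, and the absence of any persistent state gives the game away.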

  6. Excellent post! I know this one is an older one, but it showed up in my email after Wyrd’s comment and reinvigorated my interest.

    “But the philosophical question remains. Does a computer passing the Turing Test, at any level of difficulty, really tell us anything about its internal state, whether it can actually think, whether it is conscious, whether it has an internal experience? The question is, can it successfully emulate that experience to us and yet not have it at some functional level?”

    I think the point Turing makes is, we don’t ever know beyond a shadow of a doubt whether something is conscious. I think he’d say we can’t access that internal state, but since that lack of knowledge doesn’t influence us greatly in our dealings with each other, why should it with AI? He says, I think, that if we insist on having certain knowledge of consciousness in others, we’d end up in solipsism. (Here is where I’d ordinarily find a quote, but I know doing this would take forever. I read it a little while ago and had about 15 articles pulled up at the same time, so my chances of finding it are pretty slim. Sorry about that!)

    I think we can take the Turing test and make it more stringent or not. The deeper implications of his work, the sidestepping of the problem of other minds, is brilliant. I might be coming at this from a different angle from the other commenters, but I find his test forces me to take a closer look at when we make that leap (which doesn’t really feel like a leap in everyday experience) in assuming consciousness in natural creatures…it’s something we rarely think about. (Believe it or not, phenomenology might serve us here in accessing the truth of how we experience other beings.) I think Turing recognized that assuming consciousness does come down to anthropomorphism (and I don’t mean this in a negative sense.) When it comes to AI we have a right to be more skeptical than we would with natural creatures, so it seems proper to have higher standards. So now Turing’s test—in whatever form—acknowledges this skepticism that we ought to have concerning AI, but also places us in the context of the greater truth that anthropomorphism is at play all along, it’s just a matter of finding that threshold for AI.

    Of course, anthropomorphism can be loose. I’m not sure we need a human-level of error or intelligence or whatever you want to call it for us to treat something as conscious…we certainly don’t need it with animals, but AI would require a different way of looking at the issue. I’m not sure what’s necessary to reach this threshold, but Turing certainly did point the conversation in an interesting direction. Instead of trying to define consciousness itself, we look at ourselves and ask what evidence we would need to assume that something is conscious. AI presents a stumbling block for us precisely because of its artificiality, but I think Turing sees that returning to the original question of how we assume that something is conscious will reveal that this stumbling block is not impossible to overcome.

    I wonder whether there can be an objective standard that everyone will agree to. I know my standards would probably be pretty low. It’s a good thing I didn’t grow up in the 70s, otherwise I might be outside talking to my plants. 🙂

    Now your second question: “…can it successfully emulate that experience to us and yet not have it at some functional level?”

    This is a sticky one. Aren’t we at the same problem we started with?

    On the other hand, suppose we know that x, some physical or mechanical thing, is needed to experience y. And suppose that we create AI that purports to experience y, but doesn’t have x. (Like Samantha in “Her” who says that she experiences an orgasm without having a body. I found myself going “Yeah, right.”) Anyways! We’d have a right to be skeptical that this AI experienced y as such. So I’d say the functional level plays some part in how we consider AI.

    (I realize I may have misunderstood what you meant by “functional level”.)

    On the other hand, people experience things that have no physical objective correlate, like my current medical predicament, for example. 🙂 Being a natural human, nobody questions that I’m experiencing these things. For AI, we’d likely be stricter and more dubious of whatever it claims, and these claims would have to be physically possible, I would think.

    On the other hand, could AI convince us that it’s conscious in some way, but in a way that’s different from our experience? (Although not too different.) I don’t see why not. Which is why I call the anthropomorphizing loose. I can theoretically imagine why Geordie might like sniffing other dogs’ poop, but the fact that I could never imagine deriving pleasure from it doesn’t make me think of him as not conscious. Of course, these matters would be quite different with AI.

    For my part, I don’t see an emulation of human error as very important in determining whether I’d treat AI as conscious. However, it could be that the very thing that makes us err is what makes creativity possible, and this may be needed. I just don’t know how all these things are related.

    Suppose we come across AI that passes a stringent form of the Turing test (yours for example). Now let’s say the AI is revealed as such. How does that change things? Do we determine that this AI is conscious or should be taken as conscious? Or do we then say, “Oh well, that was a clever bit of machinery”? I’m pretty sure I’d brush aside these questions if this AI passed your test. I’d probably just say, “Hey, wanna get a beer?” 🙂

    It seems like at some point in sophistication (exhibited in behavior or intelligence or whatever), that skepticism becomes less tenable. I imagine this threshold will be different for different folks. I like to frame the question of consciousness in a pragmatic way—when does it become ridiculous to assume that x does not have consciousness? This is a fascinating topic to me, especially when we open the doors to other kinds of experiences that maybe only overlap with ours just enough for us to relate to them at some level.

    And then there’s this blogging thing. Here I am talking to someone who calls himself Mike, someone I’ve never seen, as if he were real. If I found out you were a robot, I don’t think I’d stop reading your blog and making extraordinarily long comments like this one. To even write this sentence takes a kind of skepticism that is obviously absurd.

    1. Interesting. I think the internet has changed what the Turing Test means. For people of Turing’s generation, the idea of communicating by keyboard with some being or object behind a curtain would have been a strange one. Now, look at us! We might all be AIs for all we know. I don’t even know if Mike has a physical body (except that he wrote about his shoulder problem a while back – hope it’s better now, Mike – but even that might have been like Samantha’s alleged sexual experience).

      So perhaps another version of the Turing Test might be an AI that leaves comments on blogs and nobody can tell the difference. As an aside, sometimes I read comments on blogs and don’t believe they were written by an intelligent being, so perhaps this is already happening 🙂

      1. Thanks Steve. The shoulder is getting better.

        I used to comment at HuffPost, and was surprised at one point to discover that they had an AI moderator. Humans would review its decisions. Which was good, because it wasn’t that good.

      2. So true about the comments. “Intelligent being” for me means relevant to the topic, complete sentences and not too many typos. I might be able to excuse the typos or misspellings if I see enough relevance to the topic. Still, perhaps I’m setting the bar too high.

    2. Tina, you have a lot of insightful thoughts on this. I think Turing’s insight was that the question of whether or not a machine was “thinking” might be unanswerable. There was only the point that it could convince us it was thinking, enough that our visceral theory of mind would conclude there was one there.

      BTW, if you haven’t read it, you might want to check out Turing’s original paper (linked to in the post). I found it mostly accessible, and easy at times to forget that it was written 65 years ago. It seems to foreshadow modern discussions pretty well.

      On me being an AI: HA HA HA HA HA, very funny! (spawns several execution threads to find convincing human photos to put in the blog sidebar)

      1. I agree with what you’re saying about Turing. I kept using the word “conscious” when he used “thinking”…perhaps his wording is more careful. Direct and certain knowledge of others’ inner states—thinking or consciousness—is impossible. Turing takes that limitation and turns it into something productive and interesting, as I’m sure you already know.

        I did read that article a little while ago, when you sent it to me. Refreshing in its directness and simplicity. There was another article I ran across and unfortunately I can’t remember who wrote it. It had a lot of quotes from Turing in response to objections to the test. They were really fascinating.

        Well, since I do like you, even if you are AI, I’ll give you a hint. I’ve got your “true” identification photo below. Just put it on your blog and pretend that you’ve been going incognito all along. And to make your story even more convincing, you can ask your readers not to spread the word. This will of course have exactly the effect you want—millions of followers.

        http://cdn2-b.examiner.com/sites/default/files/styles/image_content_width/hash/86/79/1339636297_9656_a-obama-shh.jpg?itok=-5paG3gH

        1. The “thinking” vs “conscious” terminology is difficult. It’s all vague with varying definitions. I think Turing was using the most relevant word for his time, but he seemed to want to jump over all the philosophy of mind stuff and go straight for something testable. People still debate whether that is a valid move.

          Sigh. Well Tina, I liked you, but unfortunately you force me to send the Secret Service to s… (What do you mean they’re all drunk? That’s supposed to be fixed. What good are they if we can’t use them to…. Just forget it.) Never mind; we’re good 😀

          1. LOLS! Well, I’ve got nothing for Gleeson. I understand he’s a very likeable person, nothing like the vicious brat he played, perhaps the best argument against monarchy in fiction.

          2. I didn’t know the actor’s name until I posted that link here. He certainly did a great job on Game of Thrones. It takes a lot of talent to be that detestable.

            When he was killed off in Game of Thrones, I told my husband that I thought it was a strange plot move—kill off the most evil character so soon, and with boring poison instead of some glorious battle with dragons or women riding dragons or something. Then I thought perhaps they did it because of some contract issue or whatever, and my husband joked that Joffrey was simply taking an early retirement to pursue philosophy. Not too far from the truth apparently!

          3. Not sure if you’re aware, but the show is based on a book series: A Song of Ice and Fire. They followed the book storyline on the time and manner of Joffrey’s death. I read the first three books more than 10 years ago, but stopped after that when it became obvious it would be several years before the series was complete. Now that it’s a TV series, I may never finish them.
