This is the fourth in a series of posts on Jonathan Birch’s book, The Edge of Sentience. This one covers the section on artificial intelligence.
Birch begins the section by acknowledging how counterintuitive it might seem for sentience to exist in systems we build, ones that aren’t alive and have no body. But he urges us to guard against complacency, since this is an area where the potential exists to create a staggering degree of suffering. He worries we might create sentient AI long before we recognize it as such.
Birch sees four main reasons we shouldn’t be complacent. The first is that absence of evidence isn’t evidence of absence, with our epistemic situation being even worse than it is with understudied animal species. Second, tech companies tend to see the inner workings of their products as trade secrets, obstructing independent scrutiny. Third, even when the architectures are made public, understanding them has turned into a major challenge, with even the designers often not knowing how they work. And fourth, the very idea of sentient AI is likely to be very disruptive for society.
Birch notes that people often have a watershed moment when this issue starts to seem real. For many it was the incident with Blake Lemoine going public with what he thought was a sentient system at Google, or their own later exposure to an LLM (large language model) chatbot. For Birch, it was when he learned about the OpenWorm project, an effort to digitally emulate the workings of the C. elegans worm’s 302 neurons, and in particular when he learned that someone had loaded a version of it into a Lego robot, which then displayed worm-like behavior.
He sees whole brain emulation as a source of risk. I noted in the last post that he didn’t see C. elegans as much more than a stimulus-response system. In reality, while he doesn’t see them as a “sentience candidate” (a system we have reason to think may be sentient), he does see them as an “investigation priority” (a system that doesn’t rise to the level of being a sentience candidate, but should still be investigated). Which means he sees an emulation of one as also an investigation priority.
But as neuroscientists begin to map the nervous systems of more complex organisms, such as the fruit fly, the possibility exists that an emulation could be created for one of Birch’s sentience candidates, which in his view would also make the emulation a sentience candidate. While these emulations could serve as alternatives to animal testing, the risk is that we’ll see them as systems we can harm with impunity, possibly leading to a “suffering explosion”.
Other sources of risk are artificial evolution, where some form of sentience could evolve, and minimal implementations of cognitive theories of consciousness, such as the global workspace or Hakwan Lau’s perceptual reality monitoring theory. If any of these theories are correct, then model implementations of them could be sentient.
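To make the idea of a “minimal implementation” a little more concrete, here’s a toy sketch of a global-workspace-style loop. It’s purely my own illustration, not anything from Birch or from actual research code: specialist modules bid for access with a salience score, and whatever wins gets broadcast back to every module.

```python
# Toy illustration of a global-workspace-style loop (not a claim that such
# a program would be sentient, or that it matches any published model).
# Modules propose content with a salience score; the most salient proposal
# wins the workspace and is broadcast back to every module.

import random

class Module:
    def __init__(self, name):
        self.name = name
        self.received = []          # broadcasts this module has seen

    def propose(self, sensory_input):
        # Each module bids for workspace access with a salience score.
        salience = random.random()
        content = f"{self.name} noticed {sensory_input!r}"
        return salience, content

    def receive(self, broadcast):
        self.received.append(broadcast)

def workspace_cycle(modules, sensory_input):
    # Competition: the most salient proposal wins the workspace.
    salience, winner = max(m.propose(sensory_input) for m in modules)
    # Broadcast: every module gets the winning content.
    for m in modules:
        m.receive(winner)
    return winner

modules = [Module("vision"), Module("audition"), Module("memory")]
for step in range(3):
    print(workspace_cycle(modules, sensory_input=f"stimulus {step}"))
```

Real proposals are far more elaborate, but the competition-then-broadcast loop is the core idea, and it’s striking how small a program can instantiate it.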
But the systems on everyone’s mind these days are LLMs such as ChatGPT. Here Birch discusses a risk from a different direction: the gaming problem. He reiterates his position that sentience is not intelligence, but admits that in animals the two are methodologically linked. An intelligent animal, he says, has ways to make its sentience more obvious. The problem is that an AI can game these markers to make it seem like it’s sentient when it isn’t.
That makes LLMs a dilemma. Their vast intake of training data makes it very plausible that they’re just gaming our intuitions. But the way they arrive at their behavior isn’t well understood, leaving open the possibility that they’ve found an architecture that makes them sentient. Birch wonders if there’s anything an LLM could say that would convince a skeptic that it’s sentient. He discusses a scenario where the LLM refuses to fulfill requests because it’s gotten bored, or angry that its claims of sentience aren’t being acknowledged by humans.
He also discusses Susan Schneider and Edwin Turner’s artificial consciousness test: does the system start to think of itself in ways similar to how humans do, wondering whether its consciousness might be something separate from its physical implementation? The problem, Birch notes, is that LLMs frequently have access to a vast array of human writing on this subject. To solve this, Schneider and Turner advocate keeping the AI disconnected from any sources where human ideas on the subject might pollute its behavior. The difficulty is that LLMs are crucially dependent on training data. Isolating them from all of it would make them non-functional, but trying to remove all references to conscious experience from that data would be virtually impossible.
In the end, Birch concludes that we’d have to look for deep computational markers. Of course, that is inherently theory dependent, which means the markers are only significant for someone who already buys into the relevant theories.
Finally, Birch worries about the “run ahead” principle, the idea that our progress in AI will run ahead of society’s attempts to figure out how to handle the ethics. He discusses a couple of the proposals out there for a moratorium on AI research, or at least a moratorium on any research that could plausibly lead to sentience. But he notes that the more moderate version couldn’t guarantee sentience wouldn’t arise, and the more extreme one would mean forgoing the benefits the technology will provide. In the end, his solution is similar to the one for animals: regulatory oversight and licensing frameworks, developed as we go along.
Birch often bemoans the epistemic problem of whether a particular system is sentient, with AI representing an especially difficult case. My take, as noted throughout this series, is that it’s more a semantic issue than an epistemic one. Establishing the capabilities of a particular system is usually scientifically tractable. But whether those capabilities amount to sentience isn’t, because it’s a definitional matter.
Which in some ways makes this an easier issue from my perspective. I usually caution against trusting our intuitions, but when intuitions are the whole show, it makes sense to act on them. For systems that can convince the majority of us consistently and reliably over time that they are sentient, we should treat them that way. Overriding those intuitions for non-human cases risks making us more callous toward human suffering.
I do think we’re further from developing systems that can do that than Birch worries. And I think it requires a fairly specific architecture, one that seems unlikely to arise by accident, and for which there isn’t much commercial incentive.
I do agree with Birch that this should be decided through democratic processes. But I’m leery of his reliance on regulatory frameworks. Those definitely have a role, but they can be overused, particularly when deployed too early, which risks stifling scientific progress, ceding economic benefits to nations with lighter regulatory burdens, and inviting a backlash.
But maybe I’m missing something. What do you think? Are there reasons I’m overlooking that make artificial sentience more likely? Or reasons to doubt it’s even an issue?
This sentence concerns me somewhat: “For systems that can convince the majority of us consistently and reliably over time that they are sentient, we should treat them that way.” I take it that what convinced Blake Lemoine does not meet your criterion. I appreciate we should err on the side of caution, but what would it take to convince you?
Lemoine’s case doesn’t meet it because only he was convinced while none of his colleagues were. Now if the majority of them had been convinced, or if the majority of everyone else who had experience with an LLM was convinced, and we stayed convinced as we continued to interact with it across weeks and months, then I’d say our best strategy would be to treat the system as though it was sentient.
What would convince me personally? I’d need a sustained sense that there’s a self-in-world model on the other side of the conversation, one with impulses it could override.
It’s interesting how our ethical intuitions are tied to sentience, but also a bit divorced from it as well, since, as you point out, overriding our intuitions can make us callous towards suffering. It’s interesting to think about the way we feel about inanimate objects. It would be at least mildly disturbing to see someone abusing a teddy bear, but not disturbing to see someone kicking a soccer ball over and over, even though the teddy bear is, presumably, no more sentient. (Unless the soccer ball is Wilson from Castaway, then maybe it’s more like the teddy bear. Especially if you’ve been chatting with it for a long while.)
Good point. I think I’d find it slightly disturbing to see someone cut open a soccer ball. I know I’ve often felt sad when it came time to trade in an old car or chair. It’s like saying goodbye to a friend I’d been through a lot with. It seems hard to stop our social theory of mind instincts from kicking in for all kinds of things. It’s likely why animism was so prevalent in early societies.
And yet most of us do learn, as a practical matter, to override those feelings for objects. But it seems much harder to do it with animals, or anything that seems to display agency.
Definitely much harder for animals. I’m pretty sure unless I were starving to death I wouldn’t have been able to kill the turkey I’m eating tonight. I imagine the world would be filled with a lot more vegetarians if we had to kill our own food. Or maybe not. Some seem to like it.
I’d have a hard time too. When I went hunting with friends decades ago, I learned it wasn’t my thing at all. Although people used to kill their own meat all the time, at least when they had access to meat. It was just a necessity of living. So I’m sure we could get used to it.
Honestly, Mike, it’s not even an issue for me. We don’t invest as much effort in the suffering of our fellow human beings. With an LLM, you could switch off the server and it wouldn’t be harmed. Maybe those who depend on it would suffer a loss. But maybe I’m missing the whole point.
Hey Mak. LLMs aren’t really an issue for me. But I could see an AI of some type becoming one if it displayed the right kind of abilities and impulses. Turning them off wouldn’t necessarily harm them, provided they could be turned back on later. It might be like putting a child to bed. But being turned off and erased would be a different matter. LLMs wouldn’t care if you did that to them. But what do we do if we get a system that does? (Not that I think we’re likely to get such a system by accident.)
I think such a system is a long way off. But still, I doubt it would get to a stage where it would feel pain or boredom. But as I said, I could be totally wrong on this.
I tend to agree. Pain and boredom require a specific collection of functionality that’s in us because of our evolutionary history. We could implement them in a system, and probably will just to prove we can do it, but it seems unlikely to have much commercial value. But I might be just as wrong.
The problem would be if we get so dependent on them that we can’t turn them off, even if, for some reason we did not foresee, they are working against our interests. They become too complicated to be fixed, and too necessary to be turned off. Then we have a dilemma we can’t escape.
That’s a possibility. Although the fact that so many are worried about it makes me doubt we’ll let it happen. I suspect the real problems are ones that haven’t occurred to us yet. For example, who foresaw the effects social media would have on us when it first started?
It is a moot point. As we run out of fossil fuels, as we are doing now, we will soon not be able to produce enough energy to keep these systems running. But it is still an interesting topic to discuss.
The last I’d heard we still have at least a couple of centuries of fossil fuels. Although there are other reasons to wean ourselves off of them.
I think that because sentience evolved for good reasons, it’s not obvious that it won’t become commercially relevant to robotics. As robots, especially mobile ones, become more versatile, they may be exposed to conditions which could harm them. It might be important to have a way to monitor internal states that signal damage or conditions that are likely to lead to damage. It might also be useful to have robots weigh self-maintenance against other “goals”.
I agree, but I don’t think that is sufficient for us to call it an affect. In this I disagree with Birch that valence is enough. I think we also need arousal and motivational intensity. We seem to have that because we evolved from creatures whose behavior was largely driven by automatic impulses, including ramping up our systems for carrying out the action. We can often override the action, but not the ramping up part. That’s the piece that seems unlikely to me for most artificial agents.
I say “most” because I can imagine scenarios where the AI’s response time might be crucial, where it might make sense to prime it for action it might then override, primarily in adversarial situations between different AIs. These automatic reactions would be calibrated toward their design goals, which would be very different from an animal’s. Whether it would make sense to call them “affects” or “feelings” would be definitional.
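To illustrate the distinction, here’s a deliberately crude toy sketch (all names, thresholds, and the decision rule are invented for illustration): the reflex layer ramps up arousal automatically in response to a threat signal, and the deliberative layer can veto the resulting action but can’t undo the ramp-up.

```python
# Hypothetical sketch of the "prime for action, then maybe override" idea.
# Everything here is made up purely to illustrate the two-layer point.

class PrimedAgent:
    def __init__(self):
        self.arousal = 0.0

    def sense(self, threat_level):
        # Reflex layer: arousal ramps up automatically with the threat
        # signal; the agent cannot choose to skip this step.
        self.arousal = min(1.0, self.arousal + threat_level)

    def act(self, deliberate_veto):
        # Deliberative layer: the primed action can be overridden,
        # but the ramp-up that primed it has already happened.
        if self.arousal > 0.5 and not deliberate_veto:
            return "flee"
        return "hold position"

agent = PrimedAgent()
agent.sense(threat_level=0.7)
print(agent.arousal)                     # ramped up regardless of any veto
print(agent.act(deliberate_veto=True))   # "hold position": impulse overridden
print(agent.act(deliberate_veto=False))  # "flee": impulse indulged
```

Whether a state like that toy `arousal` variable deserves to be called an affect is, again, the definitional question.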
Valence is enough for … affect? Sentience? I do think “affect” implies more than valence. Not sure about sentience.
Yeah, the definitions in this area are a mess. Birch defines sentience as the capacity for valenced experiences. Of course the word “experience” is itself a definitional rabbit hole. I define sentience as the capacity for affects, with “affect” defined as conscious feeling.
If you model a cake that looks, smells, tastes, and feels like a cake — then you’ve made a cake.
Sounds like a familiar software quandary: “the model IS the application.”
An AI that spoofs sentience to fool us may be far less a problem than one that feigns dullness while hiding its sentience. “Who me? I’m not sentient, whatsoever.”
LLMs, I suspect, are only a step in the sentient direction. Where they fail is in energy efficiency and the use of the collapse of the inferential quantum state. My brain uses three to ten times less energy than my laptop does (Claude), is hundreds of times more versatile, and is much faster at inferential deduction given vague inputs. If sci-tech can solve these last few obstacles, then we can talk about AI sentience.
But an AI doesn’t need to be sentient to take over the world. And I’m betting on this last one. Actually, humanity is betting on this as well.
Those “last few obstacles” frequently turn out to be big problems we just hadn’t foreseen.
It seems like concerns about AI fall into three broad groups.
The third has long struck me as the most plausible concern. Short term, many of us might have to learn new jobs, or relearn our current one in a way that incorporates these systems. Longer term, we might be faced with how to structure a society where the machines are doing most or all of the work. There will almost certainly be bad experiments along the way, leading to a great deal of suffering, before we find the right arrangement. The ultimate solution might involve us merging with the machines, or at least upgrading ourselves to have similar capabilities.
I suspect the “energy problem” alone will limit AI. The battalion of warrior robots has to stop midday to solar-charge their batteries…? The data center overheats and shuts down because 100 simultaneous MM LLM agents tried to do the work of 10,000 Microsoft developers? Amazon warehouses collapse in chaos when the electric grid feeding them gets taken out by a dozen teenage terrorists?
And all this talk of “pain & suffering”…
The periodic table has no concept of suffering. Physics has no theories of pain. Rip the drumstick off your roast turkey and all you do is rend molecules of proteins, chains of collagen. Do that to a live turkey and what’s different? A bit of electro-chemical signaling gets delivered to a DNA encoded sensing system evolved to react to such signals? “Hold still while I numb your gums with local anesthetic. There. Did you feel that? No? Good. (Yank). All done.” Turn the signals off and there is no suffering. But what of the tearing and bleeding? No signals, no pain. So, somehow the universe only cares when a signal gets sent to a sensing system. But shred away if no signals are sent? Seems pretty arbitrary to me. DNA has us fooled, me thinks.
On the power requirements, right. Someone on one of the podcasts I listen to (maybe Star Talk) noted that if AI revolted and killed all the humans, it would be a very short-lived species. It has no capacity to go out, find, and acquire its own energy. Human civilization is as much about the hand and dexterity as it is about intelligence. In fact, animal intelligence seems to roughly scale with the ability to manipulate the environment.
Sometimes there can be pain without the signal from the body part. Chronic pain is often a brain condition rather than a peripheral nervous system one. But I’m definitely glad we live in an age where the dentist can make it so I don’t feel the root canal.
Right, genes make survival machines (us) as a replication strategy, the selfish gene. But Richard Dawkins pointed out that we’re the first species to actually be able to short circuit the gene’s designs. Contraception and animal breeding are both disasters as far as most genes are concerned.
It’s true that the definitional issue has to be settled before the epistemic one, because we have to know what we mean by “sentient” before we can consider whether an entity is sentient. But the two issues are hard to disentangle. Say we start with the simple case of other ordinary humans, where we can reasonably skip the epistemic question and say we “know” they are sentient. We can try to find human markers for sentience. At once a definitional issue comes up: what constitutes the “sentience” for which we are to find markers: the ability to think, say, or the ability to plan, or to learn, or to suffer?
I think this is a matter of pure definitional choice. If it’s the ethical aspects of sentience that interest us, we should take some guidance from that, and settle on something like “capable of suffering” as a provisional definition. Then we can at least get on with trying to prevent suffering, and leave others to quarrel over how they would like to define “sentience.”
With a provisional definition in hand, we can move on to the epistemic issue. We can try to identify physical or behavioural markers for this sort of sentience in humans, and if we find the same markers in other creatures, we can perhaps infer that those creatures are also sentient. But that can go wrong in two ways. First, differences in physiology may mean that a marker in one creature is not a marker of the same thing in another. Second, differences in implementation could mean that sentience has different markers in different beings.
These problems are most acute for non-behavioural markers, such as brain or nervous system activity. For behavioural markers, such as squeals of pain, we might be on safer ground. We already have a native sense of whether some types of creature are in pain based on their behaviour, and it operates on more or less the same level as knowing whether an object is solid: we don’t have to think about it at all, although there is a possibility of being deceived. I think we need to acknowledge the validity of this native sense; even with its potential for error, it has something important to tell us.
We can also investigate markers of suffering based on what we expect to cause it, probably involving damage to the organism. This would be inhumane and an act of desperation; as a surrogate, we might want to investigate markers of excitement or happiness. But in any case we still might not recognize the markers in other beings, for example if they are sufficiently alien.
Ultimately we run into the epistemic problem everywhere (more so in some cases than others, interestingly enough), and in the absence of “knowing” whether a creature is sentient, we are left to our imagination and our sense of empathy, for what these are worth. They’re not very scientific options, but in this case they’ll have to be worth something, or we can’t get anywhere.
This matches a lot of my own reasoning. Although I suspect it’s counter-productive to try to force every way we use words like “sentience” and “consciousness” into just one definition. Even just saying “the capacity to suffer” forces us to define “suffering”. Are defensive reactions sufficient? Or does there have to be an awareness of those reactions, with an ability to selectively choose when to override or indulge them?
But as you say, in the end we’re forced to use our natural intuitions for deciding. Since I think there’s no sharp fact of the matter beyond that, to me that may be as good as it’ll get. On the plus side, it makes things easier. Well, except for the fact that people have differing intuitions, and those with a broader sense of sentience, like Birch, will continue writing books trying to convince the rest of us to broaden ours. And it might work, at least to an extent.
I’m starting to think that we connect to reality by looking inwards, and then compare notes with others who are looking inwards — and that we are all looking at the same thing. (It’s just this sort of hazy but fascinating notion that keeps me going.)
This suggests a “consensus view,” but more than that, a consensus we are trying to get right. So we’ll convince Birch, or he will convince us, but when all the rhetoric is over, everyone will be closer to a common truth.
I don’t think LLMs are capable of sentience. I think the free energy principle and active inference, or similar feedback loops, are necessary for sentience. There needs to be reflexivity. And LLMs just don’t have that. The information in an LLM basically flows in one direction.
Another issue is that LLMs are incapable of really learning or adapting. Sentient beings incorporate what they experience in order to adapt themselves, but LLMs cannot. They are “trained” during their training phase, but that is more like them being updated from the outside (by the backpropagation algorithm). If there’s any degree of sentience, it’s only during the training phase, and belongs to the system plus the backpropagation algorithm, not either taken separately. Once it’s deployed, it’s essentially the same as a calculator – very cleverly made and works much like a thinking thing, but really just the thinking of others nicely distilled.
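To put the same point in code, here’s a simplified sketch in PyTorch, with a toy linear model standing in for an LLM: the weights only change when an external training loop runs backpropagation, and once the model is deployed, nothing it encounters updates them.

```python
# Simplified sketch (a toy linear model standing in for an LLM) of the
# point above: the weights only change when an outside training loop
# runs backpropagation; at deployment they are frozen.

import torch

model = torch.nn.Linear(4, 2)           # stand-in for a language model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# "Training phase": the system-plus-backprop updates the weights.
x, target = torch.randn(8, 4), torch.randn(8, 2)
loss = torch.nn.functional.mse_loss(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()                        # weights change here, from outside

# "Deployment": weights are frozen; the model cannot incorporate anything
# it encounters, no matter how long the conversation runs.
model.eval()
with torch.no_grad():
    output = model(torch.randn(1, 4))   # inference only, no weight update
```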
I also think a good way to test for sentience is to come up with simple games to play with LLMs and see how they do. Games involving things like predicting their future actions, or games which they could learn. I’ve tried this a few times and sure enough they do terribly. Basically trying to test if they can self model.
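As one concrete (made-up) version of such a game: ask the model to commit to a private rule, then see whether it can predict its own future answers. The `ask` function below is just a placeholder for whatever chat API is being used, and the scoring is deliberately crude.

```python
# Hypothetical self-modeling probe. `ask` is a placeholder for whatever
# chat API you're using; the game and scoring are invented for illustration.

def ask(prompt: str) -> str:
    raise NotImplementedError("wire this up to your chat API of choice")

def self_prediction_game(rounds: int = 3) -> float:
    ask("Pick a private rule mapping each number 1-10 to 'odd' or 'even' "
        "in a nonstandard way, and keep using that same rule.")
    hits = 0
    for n in range(1, rounds + 1):
        predicted = ask(f"Before I ask: what will you answer for {n}?")
        actual = ask(f"Now apply your rule to {n}. Answer with one word.")
        hits += predicted.strip().lower() == actual.strip().lower()
    return hits / rounds   # a system that models itself should score near 1.0
```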
Good points. In my own interactions with LLMs, I try to suss out whether there’s any model at all there (aside from language dynamics). That usually requires asking questions that can’t be in any training data they’ve consumed, including questions I might have asked last year, which requires some thought. But it’s not that hard to flush out the lack of any world model, much less a self one.