Protecting AI welfare?

John Basl and Eric Schwitzgebel have a short article at Aeon arguing that AI (artificial intelligence) should enjoy the same protections in scientific research that animals do.  They make the point that while AI is a long way from achieving human-level intelligence, it may achieve animal-level intelligence, such as that of a dog or mouse, sometime in the near future.

Research on human or animal subjects is subject to review by oversight committees, such as IRBs (Institutional Review Boards) for human subjects and their animal-research counterparts, which ensure that ethical standards are followed for such research.  Basl and Schwitzgebel are arguing for similar committees to be formed for AI research.

Eric Schwitzgebel also posted the article on his blog.  What follows is the comment, slightly amended, that I left there.

I definitely think it’s right to start thinking about how AIs might compare to animals.  The usual comparison with humans is currently far too much of a leap. Although I’m not sure we’re anywhere near dogs and mice yet.  Do we have an AI with the spatial and navigational intelligence of a fruit fly, a bee, or a fish?  Maybe at this point mammals are still too much of a leap.

But it does seem like there is a need for a careful analysis of what a system needs in order to be a subject of moral concern.  Saying it needs to be conscious isn’t helpful, because there is currently no consensus on the definition of consciousness.  Basl and Schwitzgebel mention the capability to have joy and sorrow, which seems like a useful criterion.  Essentially, does the system have something like sentience, the ability to feel, to experience both negative and positive affects?  Suffering in particular seems extremely relevant.

But what is suffering?  The Buddhists seemed to put a lot of early thought into this, identifying desire as the main ingredient, a desire that can’t be satisfied.  My knowledge of Buddhism is limited, but my understanding is that they believe we should convince ourselves out of such desires.  But not all desires are volitional.  For instance, I don’t believe I can really stop desiring not to be injured, or stop desiring to be alive, and it would be extremely hard to stop caring about friends and family.

For example, if I sustain an injury, the signal from the injury conflicts with the desire for my body to be whole and functional.  I will have an intense reflexive desire to do something about it. Intellectually I might know that there’s nothing I can do but wait to heal.  During the interim, I have to continuously inhibit the reflex to do something, which takes energy. But regardless, the reflex continues to fire and continuously needs to be inhibited, using up energy and disrupting rest.  This is suffering.

But involuntary desires seem like something we have due to the way our minds evolved.  Would we build machines like this (aside from cases where we’re explicitly attempting to replicate animal cognition)?  It seems like machine desires could be satisfied in a way that primal animal desires can’t, by learning that the desire can’t be satisfied at all.  Once that’s known, it’s not productive for one part of the system to keep needling another part to resolve it.

So if a machine sustains damage, damage it can’t fix, it’s not particularly productive for the machine’s control center to continuously cycle through reflex and inhibition.  One signal that the situation can’t be resolved should quiet the reflex, at least for a time.  Although it could always resurface periodically to see if a resolution has become possible.
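
As a rough illustration of the control logic described above, here is a minimal Python sketch.  Everything in it (the class name `DamageReflex`, the `recheck_interval` parameter, the tick-based clock) is hypothetical and invented for this example; it is a toy model under those assumptions, not a description of how any real system works.

```python
class DamageReflex:
    """Toy model of a damage signal that can be quieted once judged unresolvable."""

    def __init__(self, recheck_interval=10):
        # Number of ticks the reflex stays quiet after being marked unresolvable
        # (hypothetical parameter chosen for illustration).
        self.recheck_interval = recheck_interval
        self.quiet_until = 0  # tick at which the reflex may fire again

    def mark_unresolvable(self, now):
        """One signal that nothing can be done quiets the reflex for a while."""
        self.quiet_until = now + self.recheck_interval

    def should_fire(self, damaged, now):
        """Fire only if there is damage and the quiet period has elapsed."""
        return damaged and now >= self.quiet_until


def simulate(ticks, recheck_interval=10):
    """Return the ticks at which the reflex fires for persistent, unfixable damage."""
    reflex = DamageReflex(recheck_interval)
    fired = []
    for now in range(ticks):
        if reflex.should_fire(damaged=True, now=now):
            fired.append(now)
            # The control center looks for a fix, finds none, and quiets the reflex.
            reflex.mark_unresolvable(now)
    return fired


print(simulate(25, recheck_interval=10))
```

With persistent damage over 25 ticks and a recheck interval of 10, the reflex in this sketch fires three times rather than on every tick, which is the contrast with the continuous reflex-and-inhibition cycle described earlier.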

That’s not to say that some directives might not be judged so critical that we would put them as constant desires in the system.  A caregiver’s desire to ensure the well-being of their charge seems like a possible example.  But it seems like this would be something we only used judiciously.

Another thing to consider is that these systems won’t have a survival instinct.  (Again unless we’re explicitly attempting to replicate organic minds.) That means the inability to fulfill an involuntary and persistent desire wouldn’t have the same implications for them that it does for a living system.  In other words, being turned off or dismantled would not be a solution the system feared.

So, I think we have to be careful with setting up a new regulatory regime.  The vast majority of AI research won’t involve anything even approaching these kinds of issues.  Making all such research subject to additional oversight would be bureaucratic and unproductive.  

But if the researchers are explicitly trying to create a system that might have sentience, then the oversight might be warranted.  In addition, having guidelines on what current research shows on how pain and suffering work, similar to the ones used for animal research, would probably be a good idea.

What do you think?  Is this getting too far ahead of ourselves?  Or is it past time something like this was implemented?

55 thoughts on “Protecting AI welfare?”

  1. I think it is definitely time to start thinking about this. Programmable household robots are just around the corner. I’m expecting my Misty II to be delivered this summer. I intend to experiment with methods for doing various tasks. The first obvious task for me is keeping the floors clean. I could see someone (if not me) creating a “desire” in the robot to keep the floor clean. The system could be (foolishly) arranged such that frustration in the task could progressively lead to behaviors essentially identical to what we would call panic. If the internals of the system are sufficiently similar to ours, which seems likely to me as that may be the most efficient way to get the behaviors desired, then I think suffering may be a good characterization of the situation and should be considered a moral issue.


    1. Hope you plan to do some blogging on the Misty II. I’d be interested in reading about your experience with it.

      I’m less convinced than you that the system will inevitably be like us, unless we go to a lot of effort to make it like us. It seems like organic brains have a lot of idiosyncrasies that exist only because of our evolutionary background.

      I also wonder how a designer would react if they saw frustration in one of their systems. Would they tone down the desire? Or allow the system to flag the desire as unachievable, at least in the short term, and so get a reprieve from that desire for a while?

  2. When I say like us, I don’t mean exactly like us. But I can see using neural networks for pattern recognition, and having competing desires and goals, and having an attention mechanism to manage competing goals, etc.

    And dealing with frustration will be a thing, because you may want to have the robot try different things when one doesn’t seem to work. For example, another task for the robot might be lining up shoes in the hall so they are all in neat pairs with the toes to the wall. What if there is an obstacle (another shoe?) between one shoe and the wall. Should the robot give up, or try to move the obstacle? How hard should it try? When do repeated attempts become suffering?

    [and then maybe there will be a need for stoicism for robots]

  3. I fundamentally agree with your response, Mike. But I don’t like the wording. Pain isn’t bad because we desire to get rid of it, although those facts are closely related. The desire and the suffering are both effects of a common cause, the way our mind/brain is structured. But your underlying point is right, that we could and usually would structure an artificial brain not to have these effects.

        1. Ah, ok, I see where you’re coming from. I used the word “desire” because it’s the way Buddhist writings are usually translated. But I mean it in the sense of a primal preference, not in terms of a planned want. (Although I do think planned wants can cause suffering too.)


  4. Well, one big difference is that AIs can be dispatched by the pulling of a plug. Dispatching animals involves pain and is less than instantaneous. If the plug on an AI is pulled and then the memory banks replaced with fresh or backup memories, then there would be no memory of pain or even realization of disconnection.


    1. Not meaning to be contrary, but I’m not sure what the difference is. My wife was a passenger in a car that was hit head on by a drunk driver. She doesn’t have any memory of the accident, so no pain, no memory of disconnection. She just woke up in the hospital. (The driver of her vehicle simply never woke up.)



    2. Along the lines of James’ comment, the attitude of the system prior to the pulling of the plug is the main ingredient. I imagine a death penalty inmate has a lot of anxiety, even if they believe the drugs will be administered in a way to make the passing painless.


    1. Anthony,
      The only animals that get protection from us are the big-eyed beautiful ones that squeal or otherwise affect us by means of our sympathy. It’s not about them but rather us. That might have been your point though.

      I highly doubt that humans will be able to develop anything sentient for a long time, if ever. Like James Cross said, this seems to be “a make-work project for philosophers”.

      1. An argument can be made that we should be concerned about any system that appears sentient, even if we have grounds for concluding it isn’t. The idea is that if we are cruel to such systems, it degrades our attitudes toward actual sentient entities.

        1. I completely agree with you on this point. We should treat our machines and tools ethically (whatever that might mean or entail) just because we need the practice. I firmly believe that a person who is capable of empathizing with a machine is more likely to empathize with animals and with other humans. Ethics is like a muscle that needs to be exercised regularly or it will atrophy. I do not see ethical concerns about AI welfare as being a waste of time or resources. On the contrary, we will need a well-developed sense of ethics in order to program ethics into our AI systems. Ethics is not about “playing nice”. Ethics is about considering the long-term consequences of what we do and what we cause to be done.

          1. In the 18th century, it was considered fine to torture cats because they were thought to be simple automatons without souls. Today most people see them as sentient beings (in the sense of being able to feel and perceive). But it’s striking that violence against humans, as a percentage of the population, was much higher then than it is now. Some of that is no doubt related to your point. Brutality toward animals made it easier to be brutal toward humans.


  5. Regarding Buddhism and suffering. The suffering in Buddhism goes beyond the experience of pain and goes to an existential experience of being trapped in a world fundamentally beyond our control where everything meaningful to us passes.

    “Dukkha refers to the psychological experience—sometimes conscious, sometimes not conscious—of the profound fact that everything is impermanent, ungraspable, and not really knowable. On some level, we all understand this. All the things we have, we know we don’t really have. All the things we see, we’re not entirely seeing. This is the nature of things, yet we think the opposite. We think that we can know and possess our lives, our loves, our identities, and even our possessions. We can’t. The gap between the reality and the basic human approach to life is dukkha, an experience of basic anxiety or frustration.”

    Constructing an AI brain with an ability to experience pain is one thing. Constructing one with an existential angst might be what would be required for suffering.

    1. Thanks for the info. Constructing an AI capable of existential angst seems like an act of cruelty. But you don’t think ongoing pain is suffering? Remember, we’re talking about pain, not nociception.


      1. I was referring more to suffering in the Buddhist sense since you brought that up.

        However, it is interesting that the neurotransmitters associated with anxiety are widespread across animals, which is why cats, for example, can be treated with Prozac. So suffering, even in the Buddhist sense, may be present at some level in many non-human creatures. I certainly think Buddhists would believe that to be the case.

        So the idea of suffering would include pain but more than pain since it would be pain accompanied by anxiety (unless you are defining pain to include anxiety).


        1. It seems like there is pain in the sense of sensations we feel with a negative valence, but we often use the word “pain” in the sense of emotional or social pain. That second use can feel metaphorical, until you realize that pain is itself a type of emotion, and that often painkillers such as Ibuprofen can actually help with emotional pain. (I actually made use of that fact when my father died in 2017.)


      2. It would also include anxiety without pain. Many animals can be stressed and show anxiety, and hence suffer, from changes in environment, death of people, etc. without experiencing any direct pain.


        1. When you think about it, anxiety is the impulse to do something, an impulse we often end up inhibiting, although the fact that we have to continuously inhibit can leave us exhausted. It’s why physical activity can often help alleviate it. It enables us to at least partially satisfy the reflex.


  6. “But regardless, the reflex continues to fire and continuously needs to be inhibited, using up energy and disrupting rest. This is suffering.”

    I think that’s just continued pain. The suffering comes as much or more from what you think about it: Can I get help in time, will I be permanently affected, what will happen, etc. On some deep level: Will I die from this?

    “So if a machine sustains damage,”

    That seems a crucial distinction. We need drugs to ignore pain from damage. To the machine, it’s just a signal with a handy switch acting like a drug.

    In the machine, as you suggest, there’s no need for the damage signal to have the commanding immediacy it does for us. It’s just a ‘red light on the dashboard’ so to speak.

    “Another thing to consider is that these systems won’t have a survival instinct.”

    That is not a given, depending on how we approach design. If it follows the one really successful path so far — deep-learning neural networks — we won’t be in control of how the machine “thinks.”

    It has nothing to do with being an organic mind. Any system smart enough to form any of its own goals, almost has to converge on the instrumental goal of self-survival. How else can it successfully pursue its terminal goals?

    There’s a very interesting discussion about how such a machine would view being just turned off or even just updated. The canonical line is, “Imagine someone offered you a pill that would make you hate your family.” You would view such re-programming of your goals with horror. Maybe a machine might decide being updated to version 2.0 was murdering version 1.0.

    I think an important point is that, by the time we’re making machines where ethical considerations matter, we’ll be making machines that are setting their own instrumental goals — they will be deciding how to succeed, and we may find their logic hard to understand.

    1. “I think that’s just continued pain.”

      To me, that is suffering. Although I agree that our thoughts about it can make it much worse. As I noted on the crawfish thread, they’re missing most, if not all, of the existential despair we’d have in their situation.

      “Any system smart enough to form any of its own goals, almost has to converge on the instrumental goal of self-survival.”

      This seems like a logical leap to me. If a machine has to use its intelligence to clean the house, why does it need a survival instinct? I can see it having an impulse to remain functional to perform its task, but that seems different than the strong survival impulse we feel. Put another way, what are the logical steps from pursuing its own goals to converging on self survival?


      1. “To me, [continued pain] is suffering.”

        That’s the key: If you think so, then it is. A true masochist might view it as Tuesday night.

        “This seems like a logical leap to me.”

        Sure it is, but it’s not a far distance to leap.

        “If a machine has to use its intelligence to clean the house, why does it need a survival instinct?”

        If it’s smart enough to decide how to clean the house — its goal in “life” — isn’t it also smart enough to connect two simple dots: IF I die, THEN I cannot clean the house.

        We’re talking about machines we’d be concerned about killing, so we’re talking about something along the lines of a robot maid or butler, not a Roomba. Even the dumbest humans can figure out they need to keep their job. Why wouldn’t a robot mind grasp the connection between extinction and its goals?

        Are you granting it the autonomy to figure out house-cleaning on its own, how to understand the distinction between a mess the owners want left alone and the messes to clean?

        Or maybe look at it like this: if it wasn’t an AGI capable of making these connections, and was just a smart machine of limited function, would we care about “killing” it at all?


        1. The difference, I think, is that for us survival is a primal goal. We don’t want to survive to do something. We just want to survive. (There are exceptions, but they are exceptional.)

          A cleaning robot might want to survive to keep the house clean, but its continued functionality is only an intermediate goal. If it’s to be replaced by a newer model, I can’t see any reason it would care. If its survival in any way got in the way of keeping the house clean, like maybe it was starting to leak oil or something, I can see it recommending to its owner that it be replaced, and not out of a sense of duty, but of primal impulse.

          Put another way, modern computing devices are at least as intelligent as worms and jellyfish. Yet they show no inclination to act like these creatures. Survival and homeostatic impulses in life predate intelligence. Intelligence just enhances our ability to act on those impulses. Intelligence would enhance a machine’s ability to act on its programming, whatever that is.


          1. “Put another way, modern computing devices are at least as intelligent as worms and jellyfish.”

            None of which engender conversations about killing them. We’re talking about something smart enough for that to be an issue.


        1. Thanks for the video link[s], Wyrd. Very useful for putting vocabulary into this discussion. So there is a big difference between terminal goals and instrumental goals. Terminal goals are teleonomic, inherent to the mechanism, instincts. Instrumental goals are teleologic.

          So maybe suffering is when there is a perception that a measurement of the world is at odds with a goal, which triggers a desire to do something, but nothing can be done. Suffering would be the state of desiring to change something (anxiety), but being unable to do so.



          1. “Terminal goals are teleonomic, inherent to the mechanism, instincts. Instrumental goals are teleologic.”

            They seem good fits, as I read them, but that could be just my take on it. I don’t know that I would say terminal goals are inherent — twin siblings could easily have different terminal goals in life, for instance. To me teleology involves the intentional purpose of things, so I don’t feel it fits, either.

            But, again, just my take on it. (FWIW, I tend far more to be a distinguisher of words and ideas than a conflator. Many folks participating here seem more conflators than I am.)

            “Suffering would be the state of desiring to change something (anxiety), but being unable to do so.”

            To me suffering always has an existential component. It is, at some level, about death. There are lots of things in my life I would change if possible, but the inability doesn’t cause me, particularly, to suffer.

            That doesn’t mean there aren’t unchangeable situations that do cause suffering. I just don’t see it as defining.

          2. Wyrd,

            goals, intents, purpose, teleology, teleonomy, control, predictive coding, cybernetics, are all about the same thing. They can all be described in terms of the others.

            Re suffering: is my suffering from a paper cut really about death? What about suffering from chronic boredom?

            I understand there are things you want to be different that you cannot make different, but do you feel a pressing need to do something about those things? Do these things come to your attention repeatedly? It’s that pressing need (a terminal goal) that makes it suffering.



          3. “They can all be described in terms of the others.”

            Of course. The different words still have distinct meanings. That’s why they’re different words.

            “Re suffering: is my suffering from a paper cut really about death? What about suffering from chronic boredom?”

            Honestly, I couldn’t call either of those suffering. I reserve the term for things like starvation, prison camp, death of a loved one,… existential hurt and harm. Agony.

            “I understand there are things you want to be different that you cannot make different, but do you feel a pressing need to do something about those things? Do these things come to your attention repeatedly?”

            Um,… huh? I don’t know what you’re talking about. I was talking about words (wyrds!). As you might expect from my handle, I care a lot about words and their intentions and I think they should be used with precision.

            Given that, when people see similar things and conflate them as identical things, I just think that’s a kind of category error, that’s all.


          4. Wyrd,

            Forgot to mention twins. You said “twin siblings could easily have different terminal goals in life”. What kind of terminal goals are you referring to? A goal to become a doctor is an instrumental goal. A goal to become rich is an instrumental goal. So again, what terminal goals are you referring to?



          5. “A goal to become a doctor is an instrumental goal.”

            Depends on why you want to be a doctor. If you want to be a doctor to be a doctor (because you think doctors are way cool), that’s a terminal goal. If you want to be a doctor because doctors heal the sick or because doctors are rich, then, yes, it’s an instrumental goal.

            Whatever you want because you want it (and not as a means to some other goal) is a terminal goal. It’s those means to achieve your terminal goals that are instrumental to that.

            “A goal to become rich is an instrumental goal.”

            Almost certainly not (for most rich people, anyway). Unless you don’t care about the money but see it as a means to some other terminal goal.


          6. “I reserve the term for things like starvation, prison camp, death of a loved one,… existential hurt and harm.”

            That explains everything. You have a very narrow, and non-standard, definition of suffering. Most people would call that extreme suffering.

            “If you want to be a doctor to be a doctor (because you think doctors are way cool), that’s a terminal goal.”

            Which is the terminal goal here? To be a doctor or to be cool? I’m pretty sure no-one has an innate desire to be a doctor.



          7. “Most people would call that extreme suffering.”

            There is a spectrum, of course. I can’t speak for others, but for me, paper cuts (let alone boredom, which is self-caused) aren’t particularly on it.

            “Which is the terminal goal here? To be a doctor or to be cool?”

            There are many ways to be cool; being a doctor is just one. The point is seeing being a doctor as terminal goal.

            “I’m pretty sure no-one has an innate desire to be a doctor.”

            That just doesn’t match my experience with people. I’ve met many who want to be a thing (doctor, lawyer, writer) because they view that as a great thing to be.

            My whole career turned on pursuing things I wanted to do because I wanted to do them. My first job in high school was in retail, but every other job I’ve ever had involved doing something I loved. So, of course you can become a doctor because you love being a doctor.


          8. But could you have “becoming a doctor” as a terminal (primal) goal if you’ve never heard of doctors? You might have other terminal goals, like “being admired”, “being helpful”, “living comfortably” and recognize that “hey, being a doctor will get me all those things”, but that doesn’t make “being a doctor” a terminal goal. It makes it an instrumental one.



          9. Yes, clearly. As I said originally:

            “Depends on why you want to be a doctor. If you want to be a doctor to be a doctor, that’s a terminal goal. If you want to be a doctor because doctors heal the sick or because doctors are rich, then, yes, it’s an instrumental goal.”

            And, obviously, you can’t have a terminal goal for something you’ve never heard of. Nor could it be an instrumental goal. Things you’ve never heard of aren’t much of anything to you.


          10. I haven’t watched the videos, so I may not be completely understanding what is meant by “terminal goal”, but can a volitional goal ever really be a terminal one? What I mean by “primal goals” is something visceral and instinctive, something that can never be a goal the system comes up with on its own.


          11. “[C]an a volitional goal ever really be a terminal one?”

            As I understand it, yes, because it’s anything you want because you want it, not as a means to some greater goal. The terms, “terminal” and “instrumental,” really say it all.

            Terminal goals are your final, ultimate goals. Ones where you’d say, “Okay, I’m done here. I’ve accomplished what I set out to do.”

            Instrumental goals are anything you see as a means. Ones where you’d say, “Okay, that’s step X on the path. Now for step Y.”


          12. Although, in the context of AGI, it’s generally considered that an AGI would have fixed terminal goals provided by the basic programming, but would be determining on its own how to accomplish that task. That is, it would be setting its own instrumental goals.

            And we might not like those instrumental goals. Or, at the least, find them surprising. Which was my original comment.

            And note that this has already been observed in how deep-learning neural networks pursue goals in chess, Go, and video games. Expert players of Go and chess have mentioned how the systems made moves they never would have thought possibly fruitful.


          13. Right. But my point was that in organic systems, the terminal (primal?) goals would be instinctual primal needs. In that sense, no one ever just wants to be a doctor because they want to be one, they always want it in service of some need, maybe the need to be great, for esteem, or to have money. It might even have started to satisfy a particular need, but over time the primal need it’s satisfying changes, such that a person has lost conscious track of why they do what they do.


          14. “In that sense, no one ever just wants to be a doctor because they want to be one,”

            I think that’s an overly reductive view, because, as you suggest, it reduces all apparent terminal goals to basic primal ones. (“Primal” goals being a fine term for such deep-seated drives.)

            Doctor, lawyer, writer, programmer, are specific. Primal goals are general.

            Thing is, “terminal” and “instrumental” goals are terms of art used in AI safety research. At the level of conscious goals, they make a meaningful distinction between ends and means.


          15. Well, you know me. I reduce until someone cries foul, then I reduce some more 🙂

            But if a terminal value isn’t a primal one, then what about it makes it “terminal”? As I understand the terminology, if the goal or value is extrinsic, dependent on some other value or goal for its worth, then it’s an intermediary. Terminal goals, to be irreducible, must be intrinsic, a value unto themselves, held without reference to any other value. Or am I missing something?


          16. I think we get on the same page by seeing this as context-dependent. Terminal and instrumental are relative to the context. Also, the terms specifically apply to AGI development.

            At one level, primal goals are our innate terminal goals. For simplicity, let’s just consider one: “Be Happy.” But the logic equally applies to “Be Safe” or “Be Wealthy” or “Be Famous” or whatever.

            The equivalent AGI would be one programmed with the same general goal, “Be Happy,” and (exactly as you’re saying with us) all its other self-derived goals would be instrumental to being happy.

            At the level of house-cleaning robot AGI, with a terminal goal of “Be a Maid/Butler,” the equivalent human goal would be somewhat along the lines of having a career.

            The whole point of the terminology is to recognize that we will build AGI systems with built-in terminal goals, but the whole point of AGI is letting it determine instrumental goals.

            The further point being that we might find those instrumental goals not only unexpected (which we’ve already seen) but undesirable from a human perspective.


          17. Your examples of primal goals don’t seem concrete enough. Actually the name “primal goal” itself seems problematic. A real primal value would be “find food” or “avoid big unpredictable things”. In truth, even these phrases are summations of complex automatic reflexive processes. “Be Famous” is itself an intermediate goal to achieve certain ends: secured food sources, protection, mates, etc., and “be happy” seems more like a broad category of anything with a positive valence.

            This is complicated because we are a social species, and many of our primal values will be entangled with that, but it requires that complex associations be built over a lifetime for them to manifest. So a value of “avoid humiliation” is primal, but what triggers it is very complex and needs to be learned.

