Robot masters new skills through trial and error

Related to our various AI discussions, I noticed this news: Robot masters new skills through trial and error — ScienceDaily.

Researchers at the University of California, Berkeley, have developed algorithms that enable robots to learn motor tasks through trial and error using a process that more closely approximates the way humans learn, marking a major milestone in the field of artificial intelligence.

They demonstrated their technique, a type of reinforcement learning, by having a robot complete various tasks — putting a clothes hanger on a rack, assembling a toy plane, screwing a cap on a water bottle, and more — without pre-programmed details about its surroundings.

…In the world of artificial intelligence, deep learning programs create “neural nets” in which layers of artificial neurons process overlapping raw sensory data, whether it be sound waves or image pixels. This helps the robot recognize patterns and categories among the data it is receiving. People who use Siri on their iPhones, Google’s speech-to-text program or Google Street View might already have benefited from the significant advances deep learning has provided in speech and vision recognition.

If the robot learning concerns you, if you’re concerned that it, or more likely one of its successors, might bootstrap itself into Ultron or Skynet, then consider this part:

The algorithm controlling BRETT’s learning included a reward function that provided a score based upon how well the robot was doing with the task.

BRETT takes in the scene, including the position of its own arms and hands, as viewed by the camera. The algorithm provides real-time feedback via the score based upon the robot’s movements. Movements that bring the robot closer to completing the task will score higher than those that do not. The score feeds back through the neural net, so the robot can learn which movements are better for the task at hand.

Obviously, what direction the robot learns in, and what it will do with what it’s learned, will be heavily influenced by its reward system, in other words, by its programming. (Just as what direction we learn in is heavily influenced by the gene-propagating reward system evolution programmed into us.)
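To make the reward idea concrete, here is a toy sketch in Python. It is not BRETT's algorithm: the article describes a deep neural net, while this is plain trial-and-error hill climbing, and every name in it (`reward`, `learn_by_trial_and_error`, the 2D target) is made up for illustration. The only thing it shares with the quoted description is the core loop: try a movement, score it, and keep the movements that score higher.

```python
import math
import random


def reward(position, target):
    """Score a position: the closer to the target, the higher (less negative) the score."""
    return -math.dist(position, target)


def learn_by_trial_and_error(target, trials=2000, step=0.1, seed=0):
    """Hypothetical toy learner: random nudges, keeping only those that raise the score."""
    rng = random.Random(seed)  # fixed seed so repeated runs behave the same
    position = [0.0, 0.0]      # assumed starting pose of the "hand"
    best = reward(position, target)
    for _ in range(trials):
        # Try a small random movement in each coordinate.
        candidate = [p + rng.uniform(-step, step) for p in position]
        score = reward(candidate, target)
        # Keep the movement only if the reward function says it helped.
        if score > best:
            position, best = candidate, score
    return position, best


final_position, final_score = learn_by_trial_and_error(target=(1.0, -0.5))
print(final_position, final_score)
```

Swap in a different `reward` and the same loop heads somewhere else entirely, which is the point: the learner has no opinion about the task beyond what the score tells it.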

26 thoughts on “Robot masters new skills through trial and error”

  1. If we can program the robot to have a reward system, it is likely a small step for the robot to learn and modify that system. Then you’ll have fully autonomous machines. It will be interesting to see what rewards it favours.

    1. That’s an interesting point. If we give it the ability and motivation to modify the reward system (we don’t have to), I would think which rewards it favors would be related to its existing rewards. Maybe we would give it the ability to craft intermediary rewards in service of higher priority long term rewards.

      It’s worth noting that our ability to modify our own reward system is very limited. For example, I wish I could make myself enjoy sweets and caffeine less, or enjoy exercise more, in service of the reward of being healthier.

      1. If it can modify the reward system, it would modify it so that it gets rewarded for the tasks that it finds easiest to complete. It would thus evolve laziness, and I bet that an AI might one day even beat humans at this skill.


  2. A “reward system” presumes a robot that needs rewards. The key distinction between animate and inanimate is that life comes with needs. The needs motivate animation toward satisfying them.

    With self-awareness (a self-monitoring loop of some maximum iterations) and some equivalent to our biological needs, the robot will have something worth thinking about.

    I remember ages ago reading about this little robot on wheels that had a single function: find a socket to plug into to recharge its batteries. Life-form? Well, maybe after we give them the need to build baby robots.


    1. Excellent point. A robot’s fundamental needs are, of course, what needs we give it.

      What would make it a lifeform? Self concern and self replication seem like good criteria for me, although “life” is a hazy concept to begin with.


      1. Needs are the difference between the animate and the inanimate. An amoeba extends a pseudo-pod, a tree sends roots into the ground and leaves into the sunlight, a newborn cries out for food and warmth. The point of animation is to satisfy needs.

        That’s also the basis of morality. We call something “good” if it satisfies a real need we have as an individual, as a society, or as a species. Morality is species-centric. What is good for the bacteria may be very bad for the human.


        1. I remember commenting to someone one day that the office refrigerator had so much neglected and forgotten rotting stuff in it that when you opened it, you experienced the smell of death. The other person pointed out it was actually the smell of life, of bacteria feasting. (Although, now that I think of it, it was technically the smell of bacterial waste product from that feeding.)


    1. One of the reasons sci-fi author Greg Egan gives for the utopian view of post-singularity society in his stories is that we could modify ourselves to remove our worst impulses.

      Of course, as Steve points out, there’s always the possibility that some of us will modify ourselves to be more ruthless and aggressive.


  3. Cool robots! I hope they make some that will clean my house. That would be so freaking awesome. If they can already put clothes on a hanger, it seems like they’re not too far away from doing my laundry. I’m getting excited already. I wonder if I’d be able to teach it…you know, to speed up the process so I don’t end up with clothes in the toilet.

    Such a robot really is exciting and would be life changing for people with disabilities. I wonder if what the robot learns has to be programmed, or if it can learn to do any task? For instance, does “hanging clothes” have to be individually programmed?


    1. I think the whole purpose is for them to be able to learn those things. Of course, very basic things like hanging clothes will hopefully be in their repertoire before they are delivered. But learning where your clothes are and where you’d like them hung would be crucial.

      I’m definitely hoping they have these guys ready by my old age feeble stage.

      1. I hope they’re around soon too, but I get the feeling it will be a long while. Imagine how much it would cost. On the other hand, if we could get one that did household labor and nursing, it might be cost effective when you consider the cost of nursing homes/assisted living.

        But it really does seem like we’re far away from that. I have yet to buy one of those robotic vacuums. For some reason I’m skeptical that they’ll really do the job.


        1. You might be surprised how fast things change. I’m thinking five years after they hit the market, it’ll be mandatory for insurance companies to pay for them.

          I’ve thought about picking up a Roomba, but I suspect the clutter in my house would defeat it. (Or cause it to decide that I must be eliminated.)

  4. In past discussions I’ve had the impression you favored out-of-the-box implementations of AI (created through observation, analysis, and synthesis) over systems that work more like human brains do — a basic system that needs training. Have I misunderstood your position?

    In any event, bit of synchronicity here. I read an article just the other day about deep learning systems. There is an apparent advance in such networks that’s related to renormalization techniques used by physicists:


    1. Hmmm. Well, I’ve said that it makes sense for AIs to come with more innate knowledge than, say, a human newborn. But I still want them to be able to learn. I’m sure the exact ratio will change depending on the engineering purposes.

      Thanks for the link! I saw that article. Renormalization seems like the intuitive way that patterns are stored and recognized in the brain. I also think it’s why memories tend to get modified over time as base layer patterns underlying a biographical memory get changed with new experiences. But working out the mathematics is cool.


      1. Yeah, it is. If you read all the comments, it appears there are detractors, but (A) it’s the interweb and there are always detractors, so (B) I’m not sure how seriously to take them. Quanta Magazine does seem a bit “Oh, Gosh!” at times. [shrug]

  5. The scary “Skynet” scenarios come from the idea of complex goal-oriented networks (such as discussed in this post) given goals such as “protect America” and which — lacking human values — decide we’re best protected if all our enemies are dead. Or a scenario that goes back at least as far as Asimov, that a “protect humans” goal is best served by removing all our freedoms (a goal even humans attempt to implement sometimes).

    That suggests autonomous systems require human values on some level if they are truly autonomous and self-learning. And that raises the issue of how we control those values and what else might arise. The linked article mentions a scenario where a household robot given the goal of protecting and feeding the children discovers there is no food in the house… and no one thought to include the idea that the family cat is not food.

    The thing about goal-oriented systems isn’t the goals so much as the constraints on those goals! Ultimately we find that we need to limit the capabilities of such systems (which reduces their value), or we need to be very, very careful!

    (Another bit of synchronicity here. The other night I read a short story by David Brin in which humans determined that AI needed to be created at “adolescent” level and schooled by education and parenting to ensure and instill human values. Of course, they grow up a lot faster than any teenager. 🙂 )

    ((That same short story touches on a strong version of the Fermi Paradox involving Von Neumann probes. Even at our current stage of space exploration we’re sending out robots, and that seems a very effective way to explore space. Von Neumann probes could explore the galaxy in a million years or so easily. So… where are they? The usual answers apply, but in this case, so does one more: AI turns out to be an unworkable solution for some reason. That seems unlikely given even our current level of development, but since the most likely way to explore space is through AI, its apparent lack is interesting.))


    1. On Von Neumann probes, it could also be that AIs at the level of human sapience require a substrate that couldn’t handle interstellar space travel any better than human bodies do. I agree that it seems unlikely though.

      We also can’t rule out that there are alien Von Neumann probes somewhere in the solar system, or even here on Earth, but microscopic. But I think the probability of that is low.


      1. Yeah, agreed, microscopic sounds unlikely. The Brin story posited a galaxy filled with Von Neumann probes from many different civilizations with radically different goals (destroyers, harvesters, seekers, greeters, et many cetera). Millions of years ago they warred. The story involves the broken, feeble remains of a number of benign ones hiding out in the asteroid belt. Humans have just discovered these remains as they begin to explore the belt.

