10 December 2024

Disclaimer: The following opinions are my own and not necessarily the views of my employer.

Since the creation of primitive tools such as sticks and stones, mankind has always feared its own creations. Many great works of literature and cinema are built around this idea: Frankenstein, Jurassic Park, Westworld, and of course, Terminator. In our modern day, many people are afraid of the rapid advances in AI and ML, and rightfully so. AI represents a complete paradigm shift in which humans no longer have a monopoly on artistic and intelligent endeavors. Are we doomed to suffer the consequences of our inventions, or is it possible for AI to stay under control and become a useful tool? I believe the answer, once again, lies in the humble training of a child!

What do time-out, jail, and meditation all have in common?

They are all forms of limiting freedom. Over the course of this post, let me tell you three separate personal stories that elucidate this somewhat obvious fact.

When my older son Nolan (4 years old) misbehaves violently, for example by hitting or throwing objects, we use the extremely effective punishment known as time-out. Nolan goes to his time-out chair, and he sits there until our kitchen timer goes off after six minutes. As previously discussed in The Great Learning Agents, time-out can be viewed as a sort of game where the only valid action is to do nothing. Any attempt at escape or violent rebellion will only lengthen the stay in time-out.
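To make the "game" framing concrete, here is a minimal sketch of time-out as a tiny environment. Everything in it is my own illustration rather than anything from The Great Learning Agents: the names `TimeOut`, `WAIT`, and `steps_served` are hypothetical, the six-step duration mirrors the kitchen timer, and the two-step penalty per escape attempt is an assumed value.

```python
from dataclasses import dataclass, field

WAIT = "wait"  # the only action that counts as sitting still


@dataclass
class TimeOut:
    """Toy time-out game: it ends only after enough steps of doing nothing."""
    duration: int = 6   # steps to serve, like the six-minute kitchen timer
    penalty: int = 2    # assumed: extra steps added per escape attempt
    remaining: int = field(init=False)

    def __post_init__(self):
        self.remaining = self.duration

    def step(self, action: str) -> bool:
        """Apply one action; return True once the time-out is over."""
        if action == WAIT:
            self.remaining -= 1
        else:
            # Any attempt at escape or rebellion lengthens the stay.
            self.remaining += self.penalty
        return self.remaining <= 0


def steps_served(actions):
    """Return how many steps it took to finish, or None if never finished."""
    game = TimeOut()
    for count, action in enumerate(actions, start=1):
        if game.step(action):
            return count
    return None
```

Under these assumed rules, a player who waits every step is out in six steps, while each escape attempt costs that step plus the two penalty steps.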

Why do we really put kids in time-out?

We put them in time-out to disincentivize the bad behavior. Why do we want to disincentivize bad behavior? Because eventually children grow up, and we want them to succeed in life and not to harm us, themselves, or others. Eventually, if I’ve done my job as a parent well, I expect the day to come when my son will be smarter and stronger than me. Whether due to his own accomplishments or through my slow decline from aging, he will eventually have power over me, and I would like him to be a benevolent overlord. Perhaps you have seen this coming, but herein lies the bridge to our societal concern: if we believe AI will one day be smarter and stronger than us, perhaps the secret to creating benevolent AI overlords is already in front of us? We know how to raise good children, so I believe we know how to raise good AI.

Adults Go to Time-out Too

Oddly enough, time-out is not merely for children, but for adults as well. Adult behavior that is incredibly out of line with what we desire as a society results in a loss of freedom. Adults face jail, house arrest, and the loss of other privileges.

Someone close to me, who will go unnamed, went to jail a few years ago when his behavior got incredibly out of line. Strangely enough, this was one of the best things that happened to him in recent years. Going to jail was a wake-up call, and since then he has been able to turn his life around: he has lost 70+ pounds of excess weight through healthy diet and exercise, and he has abstained from alcohol for the last three years. It turns out that jail can be an effective punishment for adults as well, because it gives us space and time to break out of the bad habits we have adopted and to think about our poor behavior. I want to be abundantly clear that I don’t think prison is a magic solution that solves all bad behavior: I’m not enough of an expert to comment further on that topic.

I believe that we will need to put our AI agents into jail as well when their behavior gets out of line. In fact, this has already been happening for years: think back to Microsoft’s Tay, for example. Tay was effectively given the death penalty, never to see the light of day again. If we want to create long-running AI agents that continue to learn day after day, I believe we need to lessen the penalty to time-out levels, and we also need to encourage AIs to learn self-control.

Beyond External Punishment: Self-Control

Meditation is like a self-enforced time-out. When our brains are not thinking straight and we have the awareness to recognize these ineffective thought patterns, many of us turn to some form of mindfulness meditation. Setting a 5-minute timer and intentionally clearing our minds of thought has been shown to be an effective means of improving mood and essentially resetting our minds.

I’ve found this to be true in my life. When I’m getting angry for one reason or another, or if I’m simply having a bad mood kind of morning, I find that doing a 5-minute mindfulness meditation can really help me break out of my inner mental loops and get back to productive work.

Can AI agents learn self-control?

I believe so. I’ve recently been doing some research into Richard Sutton’s work on continual deep learning, and I think that for reinforcement learning problems, it would be really interesting to take some inspiration from children going to time-out. Say we as AI game developers have a large open world full of many tasks, and we want to deploy AI NPCs in this world. Assuming we are using Sutton’s techniques to maintain plasticity in our agents’ brains, perhaps we can train a meta-learning algorithm that tries to expose the learning agents to increasingly complex tasks. When the agents’ behavior gets out of line, we can send them to an in-game time-out where they cool off their learning rates and are forced to re-learn how to sit still and not take random actions for some period of time. Perhaps in this way our agents will be able to learn self-control, and then we will finally be able to trust our agents not to harm us. I hope to have time to experiment with this idea, but perhaps others have already thought the same and can point me to work like this.
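Here is a rough sketch of what that in-game time-out might look like. To be clear, this is not Sutton's method and it includes no plasticity-preserving technique or meta-learner. It is just a toy value-based agent I made up to illustrate the loop. The names `Agent`, `time_out`, and `NOOP`, and the values for the cooling factor, duration, and penalty, are all assumptions.

```python
import random

NOOP = 0  # index of the "sit still" action


class Agent:
    """Tiny value-based agent: one estimated value per action."""

    def __init__(self, n_actions=4, lr=0.5, epsilon=0.2, seed=0):
        self.values = [0.0] * n_actions
        self.lr = lr
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self):
        # Epsilon-greedy: occasionally take a random action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def learn(self, action, reward):
        # Move the value estimate a fraction `lr` toward the observed reward.
        self.values[action] += self.lr * (reward - self.values[action])


def time_out(agent, duration=6, penalty=2, cool=0.5, max_steps=1000):
    """Run one in-game time-out and return how many steps it took.

    The learning rate is cooled by `cool`, then the agent must choose NOOP
    for `duration` steps; any other action adds `penalty` steps to the stay.
    """
    agent.lr *= cool
    remaining, steps = duration, 0
    while remaining > 0 and steps < max_steps:
        action = agent.act()
        if action == NOOP:
            agent.learn(action, +1.0)   # sitting still is rewarded
            remaining -= 1
        else:
            agent.learn(action, -1.0)   # acting out is punished
            remaining += penalty        # and lengthens the time-out
        steps += 1
    return steps
```

In a real game, some supervisor (hypothetical here) would call `time_out(agent)` whenever it flags an agent's behavior as out of line, then return the agent to its normal tasks. Whether cooling the learning rate helps or simply slows down the re-learning is exactly the kind of thing I would want to test.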

Just some food for thought. Thanks for reading!