Avoiding Catastrophe – Netopia Spotlight: Prof. Stuart Russell

Artificial intelligence is no longer the sci-fi future onto which we have so often projected our fears and dreams. Today it is in every person’s hands (or device), as the runaway popularity of text-generating AI systems such as ChatGPT and text-to-image systems such as DALL-E and Midjourney has demonstrated. What does this mean for the impact of AI?

Netopia spoke to a true veteran and thought leader in the field, Professor Stuart Russell (read Netopia’s review of his 2019 book Human Compatible: “Abstract Intelligence – How to Put Human Values into AI”).

Professor Russell came to Stockholm earlier this month for the Nobel Week Dialogue and further discussed these topics on two panels on the program; watch them here (starting at 4:30:00): Nobel Week Dialogue – NobelPrize.org

WATCH THE FULL INTERVIEW

Professor Stuart Russell, Welcome to Netopia’s Video Spotlight interview.

It’s nice to be with you. Thank you.

We are in Stockholm today for the Nobel Week Dialogue, and you have a very impressive résumé: you’re a professor of neurological surgery?

I was a professor of neurological surgery for just three years, while I was working on a research project.

And computer science has also been a focus?

Yes, computer science has been my day job, and I’ve been at UC Berkeley for 36 years now.

That’s an interesting combination, neurological surgery and computer science. The mind leaps to neural interfaces that connect to your brain.

You might think so, but actually it was a coincidence. I thought some of the basic mathematical ideas I had might be useful for some of the problems that come up in just keeping people alive in the Intensive Care Unit. When you’ve had a head injury, your brain is often unable to regulate your body, so the Intensive Care Unit is there to do it instead of the brain: to keep your temperature in the right range, to keep your heart rate, your blood pressure, your oxygen levels. It has to manage everything, and to do that it collects a lot of information. A patient in the Intensive Care Unit is plugged full of sensor devices measuring all these things, so you know when you need to fix something. But it’s very hard for human beings to keep track of all that data. So we thought we’d be able to use AI systems to watch the sensor values, determine as early as possible if something was going wrong, and then intervene earlier and more effectively. It turned out that, yes, we could do that a little bit, but the human body is a very, very complicated thing, and I think we just scratched the surface of that problem.
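To make the idea concrete for readers who code: below is a minimal sketch of the rule-based core of such monitoring, watching each vital-sign stream and flagging readings outside a safe range. The signal names and thresholds are illustrative inventions of ours, not taken from Professor Russell’s project.

```python
# A minimal sketch of rule-based vital-sign monitoring (illustrative only;
# not the actual ICU research system discussed above).

# Hypothetical safe ranges for a few vital signs -- invented for this sketch.
SAFE_RANGES = {
    "temperature_c": (36.0, 38.0),
    "heart_rate_bpm": (50, 110),
    "spo2_percent": (92, 100),
}

def check_vitals(sample):
    """Return alerts for any reading outside its safe range."""
    alerts = []
    for name, value in sample.items():
        bounds = SAFE_RANGES.get(name)
        if bounds is None:
            continue  # no rule defined for this signal
        low, high = bounds
        if not (low <= value <= high):
            alerts.append(f"ALERT: {name}={value} outside [{low}, {high}]")
    return alerts

# One sample arriving from the bedside sensors.
print(check_vitals({"temperature_c": 39.2, "heart_rate_bpm": 118, "spo2_percent": 95}))
```

The real research problem is, as he notes, far harder: the interesting failures are patterns across many signals over time, not single out-of-range values.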

That’s really interesting. It’s the subconscious operations of the body, rather than mimicking the mind, which is what we often think about when we talk about artificial intelligence.

That’s right. The connection is really a coincidence. I wasn’t trying to understand how the brain works; I was just trying to stop people from dying. But I now know much more about the plumbing of the body, basically.

You are here in Stockholm now for the Nobel Week Dialogue and will be speaking this afternoon. What’s your topic today?

There are two panels. One is on living with technology, so I’ll do an introduction about artificial intelligence: what’s happening now, what are the trends, what’s going to be the big thing in the future? The second panel is on how to avoid catastrophe, which happens to be what I’ve been working on for the last seven or eight years. I’ve been thinking about really one main question, which is: if we build machines that are more powerful than human beings, how do we retain power over them forever?


So that’s the question I’ve been asking, and it has led me in some very interesting directions, including a realization that we actually got the field wrong right from the beginning.

How so?

So, the way we thought about AI… We started doing AI roughly in the 1940s, and obviously it’s about making machines intelligent. The question is: what does that mean? Does it mean just that they write beautiful poetry? Some people thought it means they have to behave just like human beings, but that’s really a question of psychology, and humans behave in ways that are largely accidental results of evolution, of the structures of our brains and bodies and so on. You can’t really build a mathematical discipline out of that. So the definition that won out was one we borrowed from economics and philosophy: the notion of rational behavior, the notion that our actions can be expected to achieve our objectives. Obviously, if you take an action that you don’t expect to achieve your objectives, it’s not rational; it’s not intelligent to act in ways that are contrary to your own interests. So that’s the model we borrowed. For humans, that makes sense, because we come with our objectives, for whatever reason: there are things we want our future to be like and things we don’t want our future to be like. But machines don’t come with objectives.
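In the standard decision-theoretic notation (our gloss, not Professor Russell’s wording in the interview), this model of rationality says an agent should choose the action with the highest expected utility:

$$a^{*} = \arg\max_{a} \; \mathbb{E}\left[\, U(\text{outcome}) \mid a \,\right]$$

The machine is handed the utility $U$ and optimizes it; everything that follows in this conversation turns on what happens when $U$ is wrong.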

So the model we developed was: you build objective-achieving machinery, or as we call it, optimizing machinery, and then you have to plug in the objective. In the early days of the field, those objectives were logically defined goals, like “I want to be at the airport before 2 p.m.” More recently, we understand that there is uncertainty and that we have to deal with trade-offs, so we have a richer notion of what we mean by objective, but the principle is the same: our actions should be expected to achieve our objectives, and the same for machines. The problem with that model, which for some reason we just didn’t notice until recently, is that if you put in the wrong objective, then you have a problem: now you’ve got a machine pursuing an objective that is actually in conflict with what you, the human, want the future to be like. So you’re really setting up a war between humans and machines, and that’s exactly what we want to avoid.


So one answer might be: okay, we just have to make sure that the objective we put in is exactly right; that it’s complete, that it’s correct, that it covers all conceivable human interests no matter how the future actually evolves. And that’s completely impossible, because there are things that are going to happen in the future that we don’t yet know whether we’re going to like or not. So the answer seems to be to get rid of that model altogether: get rid of the model where we build objective-achieving machines and put objectives into them. What we do instead is build machines that know that they don’t know what the real objective is. They’re actually uncertain about what it is that we want; their goal is to help humans get what they want, but they don’t know what that is. That’s a new kind of program, we didn’t have those kinds of programs before, and it actually leads to all kinds of desirable behaviours. Because if the machine knows that it doesn’t know the true objective, then, for example, it has an incentive to ask permission before doing something that might violate some of our objectives, our preferences. In the old way of doing things, there was never a reason to ask permission, because the machine has the objective; that’s what it has to pursue, so pursuing it is the right thing to do, and it never asks for permission. It’s early days and there’s a huge amount of work to do, but I’m reasonably optimistic that this way of thinking about AI will actually turn out to be better, and maybe we’ll solve this long-run problem of how we maintain power over machines.

So it’s an “artificial doubting machine”… it’s a “humble machine”.
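A toy calculation shows why uncertainty about the objective makes asking permission rational. The numbers below are invented for illustration; this is a sketch of the intuition, not Professor Russell’s formal models.

```python
# Toy illustration: a machine holds two hypotheses about how much the human
# values an action. (Probabilities and utilities are invented for this sketch.)
hypotheses = [
    (0.6, +10.0),   # most likely: the human wants the action done
    (0.4, -100.0),  # but it might badly violate some human preference
]
ASK_COST = 1.0      # small nuisance cost of interrupting the human

def value_act_blindly():
    """Old model: the machine is certain of its objective and just acts."""
    return sum(p * u for p, u in hypotheses)

def value_ask_first():
    """Humble machine: ask first; the human approves only when utility > 0,
    so the harmful branch is avoided at the price of asking."""
    return sum(p * max(u, 0.0) for p, u in hypotheses) - ASK_COST

print(f"act without asking: {value_act_blindly():+.1f}")  # -34.0
print(f"ask permission:     {value_ask_first():+.1f}")    # +5.0
```

As long as the machine assigns some probability to the action being harmful, checking in first can have higher expected value than acting, which is exactly the incentive to ask permission described above.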

Since you wrote the book that we reviewed a few years ago, there has been a big change: AI has become something in everyone’s hands, with image-creating artificial intelligences like Midjourney and DALL-E, and also ChatGPT, very popular as we speak and all over social media. Did you expect this to happen, this democratization of artificial intelligence tools, and what is the impact?

It’s interesting that you bring up these two examples. The second one, ChatGPT, is very much along the lines that people have always written about in science fiction. Think about Star Trek: you could talk to the computer, ask it questions, and it would give you very knowledgeable answers. Some of the early real AI systems, even in the late ’60s, were question-answering systems. You could ask questions in English and they would answer you in English.


And interestingly, ChatGPT is not able to do some of the things that those systems were able to do in the late 1960s. For example, in those systems, the most famous being a system called SHRDLU by Terry Winograd, the conversation was about a simulated world where you were moving things around on a table. You could say to it, “okay, put the red block behind the green pyramid,” and then you could ask questions, like, “well, what’s in front of the green pyramid?”, and it could tell you. Whereas ChatGPT very quickly gets confused and can’t answer those kinds of questions at all.
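For readers who want the flavour of what SHRDLU-style systems did, a few lines of explicit world modelling are enough to answer such questions consistently. This is our toy sketch, not Winograd’s actual program.

```python
# Toy blocks-world in the spirit of SHRDLU (a sketch, not Winograd's program).
# The system keeps an explicit model of the scene, updates it on commands,
# and answers questions by reading the model back.

class BlocksWorld:
    def __init__(self):
        # behind[x] == y  means x is behind y (so y is in front of x)
        self.behind = {}

    def put_behind(self, obj, ref):
        """Command: 'put <obj> behind <ref>'."""
        self.behind[obj] = ref

    def what_is_behind(self, ref):
        """Question: 'what is behind <ref>?'"""
        return [obj for obj, r in self.behind.items() if r == ref]

    def what_is_in_front_of(self, obj):
        """Question: 'what is in front of <obj>?'"""
        return self.behind.get(obj)

world = BlocksWorld()
world.put_behind("red block", "green pyramid")
print(world.what_is_in_front_of("red block"))  # green pyramid
print(world.what_is_behind("green pyramid"))   # ['red block']
```

The point is not the few lines of code but the explicit, updatable state, which is exactly what Professor Russell says ChatGPT lacks next.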

An abstract understanding of the outside world, is that right?

That’s right. It can’t correctly build, maintain, and update a model of what’s happening in the world. It does some other things really very impressively, but those kinds of sequential tasks, not so much, though I think that’s probably just a matter of time.


The other kind of system is this idea that you could put in some text and it will produce a picture for you. I was giving a speech in the House of Lords a few weeks ago, so I just had to have some fun and put in “Members of the House of Lords wrestling in the mud”. This was with one of the Stable Diffusion systems, and it produced a really quite impressive picture of, you know, elderly gentlemen wearing long robes, covered in mud. It was quite funny. But that was never a goal of AI; it just wasn’t something that people worked on. It turned out by serendipity: people realized that if you train with both text and images, you can get generative models. It began when people found ways of generating images: if you train on lots and lots of faces, with a certain kind of technology called a Generative Adversarial Network, or GAN, then you can ask that model to generate new faces, and it’s very good at that. But then they realized that if you train in parallel on textual descriptions and images, then you can ask with text and it produces images. So, a completely new functionality that wasn’t ever really seriously pursued in AI until very recently.

It’s been a very fascinating period, and the kinds of things that are going on in AI just don’t resemble anything we did historically in terms of methodology. The early question-answering systems that I described from the late 1960s had, underlying them, a logical reasoning system with a database; we would take natural language, find the structure of the sentence, convert it into an internal formal representation, interface that with the reasoning system, and so on. Now we just make, basically, a big pot of circuit, billions and billions of circuit elements that are just tunable, and we train it on trillions of words of text. We hope for the best. We have absolutely no idea how these systems do what they do. We can’t predict when they’re going to work and when they’re not going to work. Sometimes they answer questions correctly; sometimes they just output complete nonsense. One of my friends was just sending me examples; he was trying ChatGPT and asked: which is not bigger than the other, an elephant or a cat? And GPT confidently says: “Neither an elephant nor a cat is not bigger than the other.”

You speak so fondly of artificial intelligence, almost like we talk about our children or our pets, and at the same time, some people think of it as the end of humanity…

Well, I think you can simultaneously enjoy both pictures. I mean, the things we have now are, in many ways, amusing toys, and in some sense they are like animals.


They are the result of a sort of process of natural selection, a process that, you know, is a stochastic gradient descent algorithm, which is sort of what natural selection does, though it has some other things to it too. That process is almost like a certain chemical reaction: you just sort of throw lots of stuff in, let it boil for a while, and see; maybe it’ll turn into a cheesecake, or maybe something else. It just turns into this thing and you don’t know how it works. So you play with it and you learn what it can and can’t do, like with a cat. We learn that cats don’t come when you call them; open a can of cat food, then they come. Some things cats can do, some things dogs can do. It’s almost like a new species: we’re just learning what they can and can’t do and how to use them. You know, we use dogs for hunting, we use horses for pulling carriages around, and we’ll find ways to use these chat systems; well, the new generation of systems, something that’s actually much more capable than the old chatbots were.
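For the curious, stochastic gradient descent itself fits in a dozen lines. This toy sketch of ours tunes two parameters to fit a line; scale the same loop up to billions of parameters and trillions of words and you have, conceptually, the “throw it in and let it boil” process described above.

```python
# Minimal stochastic gradient descent: nudge tunable parameters to reduce
# the error on one randomly chosen example at a time.
# Toy problem: learn y = 2x + 1 from examples.
import random

data = [(x, 2 * x + 1) for x in range(-10, 11)]
w, b = 0.0, 0.0   # the "tunable circuit elements"
lr = 0.01         # learning rate: how hard each nudge is

for _ in range(5000):
    x, y = random.choice(data)   # one random example ("stochastic")
    err = (w * x + b) - y        # how wrong we are on it
    w -= lr * err * x            # gradient of squared error w.r.t. w
    b -= lr * err                # gradient of squared error w.r.t. b

print(f"learned w={w:.2f}, b={b:.2f}")  # converges towards w=2.00, b=1.00
```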

As humans, we tend to project things like emotion and intention onto living things and objects, and of course also onto AI. As I was preparing for this interview, I thought about the old robot dog, AIBO…

That’s right, the Sony AIBO.

It was like a small puppy dog, and it acted like a puppy, floppy ears and all. Do you think there will be a point where we get a perfect puppy? Or is there something intangible, something like life or soul, that…

In principle we could do that. Whether it would make sense, I’m not sure. And it seems quite likely that the natural direction of technology will take us in different directions.

Probably, given that machines are so much faster than biological brains, and that as they scale up they have bigger memories and much more communication bandwidth with each other, they can exchange information far faster than humans can exchange information with each other. So they’re just going to look very different from biological systems, I think. And I would say the jury is still out on which technological approach will end up working. I know there’s a lot of excitement around deep networks and large language models, of which ChatGPT is an example, but there are reasons to think those approaches will fail in the end. We’re already seeing ways in which they don’t work as well as you would like, in the sense that they seem to need far more data than humans do. ChatGPT has already read possibly millions of times more text than any human has ever read, and yet it still gets very simple, basic questions wrong. The image recognition systems need to see thousands or millions of examples of a giraffe, right? But if you get a picture book to read to your child, you can’t buy a picture book with a million pages of giraffes!


There is one giraffe, and it’s really a simple, you know, yellow and brown cartoon giraffe, and that’s enough for that child to recognize giraffes in any context, anywhere in the world, for the rest of their lives. From one example: human learning is much more capable. I think that illustrates that there are basic principles we haven’t yet succeeded in capturing in our approaches to machine learning.

I think we have tended to think of AI as something foreign, or something that will come around the corner someday, and maybe a topic of science fiction. So, back to the democratization: does this change the relationship between us humans and artificial intelligence, and the expectations we might have of it, now that we can more easily interact with it?

I think it does. And to the point made earlier, we probably overestimate its intelligence, and whether it’s actually reasoning or even remembering. It’s very hard to remember that something able to generate grammatically correct and coherent text could be doing that using completely unintelligent principles. But that can certainly be done, and there are many examples. One of my favourites on the web is called the Chomskybot. The Chomskybot is a very, very simple statistical text generator that was trained on a lot of pages of the writings of Noam Chomsky, and it produces paragraphs that are very coherent and very characteristically Chomsky: very complicated sentence structures and complicated logical relationships among things. If you just ask it to write a few paragraphs, you think, “Oh my goodness, this is amazing, this program is so brilliant.” But actually, if you keep doing it, it starts to get repetitive, and then you start to see how it’s doing it; it’s really a party trick. The large language models, ChatGPT and the others, are really more sophisticated versions of that. The response they give is, in some sense, a statistical average of the kinds of responses that humans have given to those kinds of inputs, over all the text the system has ever read. A simple example: if you ask it “how are you today?”, what’s the most common answer in history to that question? “I’m fine, thanks. How are you?” So, unless it’s been specially trained to avoid it, it will probably say “I’m fine, thanks. How are you?” But that doesn’t mean that it’s fine. It’s actually just parroting what humans say. It doesn’t have any sense that it exists, or that it could be fine or not fine, or even that the word “fine” doesn’t apply because it’s a machine. It’s just parroting. If you keep that example in mind, that it’s not answering your question, that helps to dispel the illusion. On the other hand, it makes you wonder: is that what human beings are doing most of the time? We’re not really doing all the reasoning and thinking and remembering; a lot of the time our speech is generated by pulling together patterns we’ve seen in the past, or even things we’ve said in the past. And I can tell you, that’s what I’m doing right now.
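The Chomskybot-style party trick is easy to reproduce. Below is our own minimal statistical generator, a bigram Markov chain (our choice of technique for the sketch; real large language models are far more sophisticated, as Professor Russell says): it learns which word tends to follow which, then parrots statistically typical continuations with no understanding at all.

```python
# Minimal statistical text generator: record which word follows which in a
# corpus, then sample continuations. The tiny corpus is invented for the demo.
import random
from collections import defaultdict

corpus = ("how are you today i am fine thanks how are you "
          "i am fine thanks how are you today").split()

next_words = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev].append(nxt)   # duplicates preserve the statistics

def generate(start, length=10):
    word, out = start, [start]
    for _ in range(length):
        if word not in next_words:
            break                   # dead end: no observed continuation
        word = random.choice(next_words[word])
        out.append(word)
    return " ".join(out)

print(generate("how"))  # e.g. "how are you today i am fine thanks how are you"
```

Ask it “how are you” and it says it is fine, for exactly the reason given above: that is the most common continuation in its data.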

And sometimes we lie; maybe we are not fine, we just don’t want to talk about it. Okay, last question. It also appears that input data for AI is a field of power struggle on many different levels. We have big companies investing in artificial intelligence systems and trying to get access to as much data as possible to train them.

We have superpower states doing similar things. How do you see this playing out? Will the benevolent forces stand tall in the end, or is this part of the dystopia?


Well, I actually think, or at least I hope, that this data race is coming to an end. You know, this idea that data is the new oil, that the more data you have, the more powerful your systems are going to be, and whoever has the most data wins. I think that’s a horribly incorrect narrative, because getting more data doesn’t necessarily result in a more intelligent system. I think there are basic research advances that are going to be determinative of who creates the first real general-purpose AI systems. And one thing that is obvious from looking at humans is that you can be very intelligent without a lot of data. For example, the amount of text that even GPT-3, which is not the latest generation (we’re about to see GPT-4), was trained on is roughly the same size as every book ever written. So you’ve already pretty much consumed most of the text in the world. What else is there? There’s a bunch on the web, but a lot of that text was generated by computer programs spitting out instructions and news items and things like that, so it’s not clear that adds a lot in terms of creating more intelligence. So I think we are coming to the end, if we haven’t already come to the end, or soon will, of the idea that we can create more capabilities simply by having bigger circuits trained on more data. And this is an opinion, I should say, not a theorem; I can’t prove what I’m saying. It’s a gut feeling, and other people have a different gut feeling. They feel like, if you just get ten times more data and a ten times bigger circuit, something qualitatively new is going to happen. But that just feels like wishful thinking to me, because there’s no scientific basis for it; we don’t even know what it would take to produce a qualitative change in behavior.

Thank you, Stuart Russell. Thank you so much for coming to Netopia and giving this interview, and good luck with your talk this afternoon.

It’s a pleasure. Nice to speak to you.

 

Footnote:
To the reader: it is with no irony that this article was created via speech-to-text AI recognition software. Almost all video upload services today have some form of audio-to-text extraction. Perhaps this is for user output, perhaps for advertising input; a positive outcome is that hard-of-hearing and deaf users can access the content with subtitles.

In all, we’d rate it 9/10 for accuracy from input sound to output text, though in fairness it helps when the input language is English and the speaker is a professor using clear and ordered language and sentences.