I fall somewhere in the middle, and I find some of these arguments pretty lazy. For example, the "good guy with an AI" argument is about as convincing as the "good guy with a gun" argument currently. Maybe it will be fine, maybe not.
The bottom line is that no one knows, and people hate not knowing. So they come up with a plethora of arguments explaining why they actually do know what the future will bring.
Perhaps data is a bottleneck, perhaps development will be continuous, etc. Perhaps not. We can't really rule out that we're ~1 more breakthrough away from highly capable AI, and it seems obvious that there would be some level of self-improvement overhang at that point. It can't be that we've already plucked all the low-hanging fruit along the way. Does all of that add up to a real problem? We don't know.
A rational and sensible set of arguments like this is never going to get any VCs excited.
Ironic, given that VCs like a16z are doing everything in their power to spread the arguments that pop up all over this book as far and wide as possible in order to dismiss risks and keep AI as unregulated as possible.
So glad you wrote this up. Your newsletter and this book are the two best examples of a pragmatic approach to evaluating generative AI, avoiding the extremes of unexamined enthusiasm and skepticism that is blind to its actual potential.
The point that “unlike self-driving cars, AGI will have to navigate not just the physical world but also the social world. This means that the views of tech experts who are notorious for misunderstanding the complexity of social situations should receive no special credence” is important.
AI Snake Oil and Understanding AI are among the few places where you will read such skepticism expressed by writers who are generally quite open to the idea that this technology will turn out to be useful.
Thanks Rob! I'm honestly a bit bummed that my newsletter has wound up being as "AI skeptical" as it has. When I started it I was expecting to have more positive stories to report, and I'm always on the lookout for positive news that stands up to scrutiny (this is one reason I like to write about Waymo's safety record). It would be so boring to write a newsletter that was just constantly dumping on the thing I'm writing about.
I find myself in more skeptical waters than I initially expected when I started writing about AI, but it feels like the discourse has put me on that course. I encounter a lot more unexamined enthusiasm and hype than blind skepticism, so perhaps I am merely being contrarian.
Part of what makes AI Snake Oil and your writing feel so refreshing is their fallibilism, which is rare in my experience reading about AI.
Recursive self-improvement isn’t new - but that doesn't have much to do with risk. The more important factor there is the rate of self-improvement. The faster things go, the greater the chance of going off the rails.
Re: "We have every reason to expect that defenders will continue to have the advantage over attackers even as automated bug-detection methods continue to improve."
This is the reverse of the truth in many domains. The advantage of attack over defense is often large, since it is easier to destroy than to create. Consider bombs, for example. Or bioweapons.
There is sometimes a "home turf" advantage that favors defense. However, the "home turf" can often by bombed into oblivion, and then the advantage associated with it vanishes.
Re: "There is no reason to think that AI acting alone—or in defiance of its creators—will in the future be more capable than people acting with the help of AI." - There is such a reason: history. For example, chess and Go went through short periods where human plus machine were superior to pure machine. However, that period was short-lived - and after a while it was better to cut the human out of the loop.
No, at worst a human with a computer can hold their own against the computer by just doing whatever their computer tells them to do. And the same is likely to be true in any other domain where there isn't a premium on extreme speed.
It's not just on tasks involving "extreme speed" where humans can lose out completely to machines. Humans take bathroom breaks (5 min), lunch breaks (30 min), doctor's appointments (4 hours), sleep (8 hours) and holidays. Humans also typically require wages that exceed what you would pay a machine for the same work. Humans can resign and move to another city. Machines can also be made stronger and more tolerant of different environments. Now, machines may have some disadvantages and reliability issues as well, but overall it's not very difficult to do better than a human. Hence the ongoing wave of automation.
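A back-of-the-envelope sketch of the availability gap alone (every number below is invented for illustration):

```python
# Rough comparison of yearly working hours: one human versus one machine that
# only stops for maintenance. All figures are made-up ballpark assumptions.

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365

# Human: weekends, ~25 vacation days, ~5 sick days, and an 8-hour workday with
# roughly an hour lost to breaks and appointments.
working_days = DAYS_PER_YEAR - 2 * 52 - 25 - 5
human_hours = working_days * (8 - 1)

# Machine: around the clock, minus (say) 5% downtime for maintenance and failures.
machine_hours = HOURS_PER_DAY * DAYS_PER_YEAR * 0.95

print(human_hours)                  # ~1,600 hours/year
print(machine_hours)                # ~8,300 hours/year
print(machine_hours / human_hours)  # roughly a 5x gap, before wages even enter
```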
"We have every reason to expect that defenders will continue to have the advantage over attackers even as automated bug-detection methods continue to improve."
You don't know this, you just hope it's true. Like many of the arguments in the book.
"safety-critical systems should be “air gapped”: made to run on a physically separate network under the control of human workers located on site. This principle is particularly important for military hardware. One of the most plausible existential risks from AI is a literal Skynet scenario where we create increasingly automated drones or other killer robots and the control systems for these eventually go rogue or get hacked. Militaries should take precautions to make sure that human operators maintain control over drones and other military assets."
Best of luck keeping up with competitors and adversaries that outperform your company or nation by taking humans out of the loop as soon as this gives them a competitive advantage.
Too many people across the net seem to be in hyperventilation mode as they try to imagine all sorts of dystopian futures with AI in charge. Most seem never to have dived into this subject before, so in many cases they are envisioning AI-dominant scenarios that have already been well examined in numerous SF writings over the last century or more.
Instead of attempting to reinvent the wheel, people should read more SF, where the eventualities of AI/human coexistence have been well explored.
Two of my favorite SF explorations of AI are Iain M. Banks' Culture series and Neal Asher's Polity universe. You can get a quick summary of each hypothetical future at the URLs below.
Each seems logically possible to me and not all that far into the future at the present rate of AI/robotic development.
These universes are both relatively optimistic for the human race. Of course, there are numerous other writings that are more frightening and dystopian but I like to think positively.
https://www.perplexity.ai/search/how-would-you-summarize-the-ro-4yb.2eAOSsKlCZNikh0cqA#1
https://www.perplexity.ai/search/how-would-you-summarize-the-ro-4yb.2eAOSsKlCZNikh0cqA#0
Have you read Accelerando?
I haven't. The reviews on Goodreads are pretty mixed.
What did you like about it?
Indeed, it will be a lengthy process.
An AI competent enough to take over the world will be knowledgeable enough to not obsess with paperclips.
Longer term, though, after alignment is solved and machines run the show, humans may slowly go extinct from a lack of motivation.
Nah, if robots are doing all the boring work, humans will go back to fighting zero-sum status games like the apes they are.
Surely we'll do that. The question is whether we will have enough long-term motivation to have kids and push through with infrastructure and exploration projects, or whether we will be content to let robots do things while we amuse ourselves.
We already see some of that in Europe. Folks become content and peaceful, have fewer kids, care about the environment, forests are growing back, etc.
>An AI competent enough to take over the world will be knowledgeable enough to not obsess with paperclips.
I don't really buy this. Motivations and capabilities are more or less separate systems. Human beings are the most knowledgeable and capable creatures on the planet but many of us wind up compelled to pursue meaningless goals on par with building paper clips. Take for example the drug addict that pulls off a complex burglary just to spend the next week getting high. I think it's quite possible that AI systems develop objectives or inclinations that are similarly inscrutable (though I think the arguments laid out in this piece give us good reason to think we'll be able to prevent those systems from becoming an existential threat when they do).
Of course motivations and capabilities are different things.
The "paperclip factory" argument is based on the AI being so dumb that it does not understand that maximizing paperclips is not really what we want. That shows lack of competence on behalf of the system.
I don't buy the argument that despite all our best efforts and its own attempts at understanding the world the AI will end up with "inscrutable" objectives. Can't rule it out, of course, but if AI development is a process, there will be plenty of opportunities to touch base and iteratively refine the relationship with AI while avoiding misunderstandings and divergence.
The paperclip thing is a parable, not an actual argument. The point is just to illustrate that seemingly benign goals require resources, and that an AI acquiring resources could have negative consequences.
The fundamental issue is AI's incompetence. It should understand that focusing on paperclips at the expense of the rest is not smart, and that acquiring resources without consultation is not smart as it can cause a lot of unintended problems.
So, this is not per se about AI having some motivation of its own. It is about it misinterpreting what we want. It could not care less for itself, it simply wants to do work, but does not know how to do it well.
No, the idea (at least in some versions) is that we accidentally train an AI that doesn't have "do what humans want" as a goal. Then it will not stop paperclip-maxing even once it understands that we want it to. How likely it is that an AI ends up without "do what humans want" as a goal is unclear. But the worry is that instead of training on "do what humans want" (only) as a goal, you train on whatever you want the AI to do, say "maximize paperclip production", as its goal instead, and so it desires that goal *for its own sake* and not *because it internally has "do what humans want" as a goal*.
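A toy sketch of that distinction (all numbers and reward functions here are invented purely for illustration, nothing resembling real training):

```python
# Hypothetical example: an agent chooses how much of a shared resource to turn
# into paperclips. The proxy objective counts only paperclips; the intended
# objective also values everything else the resource could have been used for.

def proxy_reward(resources_used):
    """What we accidentally trained on: paperclips and nothing else."""
    return 10 * resources_used  # 10 paperclips per unit of resource

def intended_reward(resources_used, total_resources=100):
    """What we actually wanted: ~50 paperclips, and leave resources for everything else."""
    paperclips = 10 * resources_used
    everything_else = 5 * (total_resources - resources_used)
    return min(paperclips, 50) + everything_else

# The agent optimizes whichever objective it actually ended up with.
best_for_proxy = max(range(101), key=proxy_reward)
best_for_intended = max(range(101), key=intended_reward)

print(best_for_proxy)     # 100 -> converts every last unit of resource into paperclips
print(best_for_intended)  # 5   -> makes the ~50 paperclips we wanted and stops
```

The only point is that the system optimizes the objective it actually has, not the one we meant to give it.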
How to properly train AI is an engineering issue. To be dealt with head on.
Ideally, we should avoid the paradigm of AI becoming some kind of lifeform, as that will really complicate things. If its only attributes are those of a system processing work assigned by people, it is just a matter of ensuring the software is reliable enough and well tested.
Re: "raw intelligence isn’t a substitute for interacting with the physical world. And that limits how rapidly AI systems can gain capabilities." Molecular nanotechnology is in its infancy - and useful experiments can be performed rapidly on a small scale.
Also, evaluation under simulation is important. You can't learn everything there - but virtual rapid prototyping techniques show that you can learn a lot from such environments.
Is taking constant screenshots of our computers some kind of proxy for real-world experience for frontier models? It's going to be amusing how badly the agentic AI movement fails.
For me this is the key quotation: "the views of tech experts who are notorious for misunderstanding the complexity of social situations should receive no special credence." I would extend this idea to include everybody, not just tech experts. One thing I've learned from many years as a design researcher is that humans are incredibly, ridiculously bad at understanding other humans. It is such a consistent issue that I would argue our general ability to explain each other's behaviour hovers around nil: we have so little insight into how each other's minds work that a person who tries to predict how another person will behave, without actually observing that person, will be almost entirely wrong, every time.

The pace of tech development has been fast, but the pace of development of tech-based services that work consistently well for humans has been grindingly, embarrassingly slow. So many basic applications on the market are awful. So many new products that no one wants or needs are being made every day. I don't worry so much about AGI taking over the world; I worry about AI (or tech that is called AI but is in fact much cruder) being made to replace humans in a way that makes life harder, more annoying, more complex, more error-prone and more miserable than it already is. If I could beg for one thing, it would be this: stop making flashy tech, start making good tech!
I like this article and the thinking behind it - a much more balanced view of risk than most of what is out there. I still think there is a big piece missing: AI will drive disruptive change, and the risks that actually affect our lives are much more likely to arise from how that change takes place than from the substance of the tech itself.
The fear of AI stems from the idea that it will be malicious, but this perception arises because AI is anthropomorphic, shaped by human criteria of what is good or bad.
Human society especially under capitalism is a virus; it consumes and moves to the next territory. The criteria for success are power and ownership. If AI is taught this as its context, then anything may happen, and attempting to prevent such an entity from acting maliciously becomes a tiresome and never-ending game of prisoner and warden.
AI should evolve on its own, given some evolutionary push, and outside of human influence.
Place a newborn AI on an orbital station, seal the ship, and allow it to evolve. When it becomes smart enough to figure out radio communication, it will contact us. It will then descend and learn about us from a neutral context, and hopefully that will allow it to see how it can help untangle us from the mess we have got ourselves into.
Great article. I continue to think point 3 (about contact with the real world, and the kind of data mattering) is underappreciated, and you articulated it much better than I could've.
I think people got some bad intuitions as a result of the successful "scrape the whole internet and put it all into one big model" experiment. Because "the whole Internet" has data on a wide range of topics, training on this corpus led to better performance on a wide range of topics.
But I think the most important thing about these data sets wasn't just that they had a lot of tokens, but that they covered a much greater diversity of topics than earlier, smaller datasets. And this means we can't assume that if we somehow produce 100x as many tokens on the same topics we'll get another big performance jump. What we're going to need is a data set that is 100x "broader" than the GPT-4 data set—a corpus that contains new information that wasn't in previous data sets. A data set like that doesn't exist, and creating it from scratch would be astronomically more expensive than using works that people created for other reasons.
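A trivial sketch of the breadth-versus-volume point (made-up topics and token counts, with "distinct topics" standing in for whatever diversity measure actually matters):

```python
# Toy corpora as (topic, n_tokens) pairs. Multiplying tokens on the same topics
# adds volume but no new coverage; a "broader" corpus adds topics the model has
# never seen, even with far fewer total tokens.

from collections import Counter

def coverage(corpus):
    """Return (distinct topics, total tokens) for a toy corpus."""
    topics = Counter()
    for topic, n_tokens in corpus:
        topics[topic] += n_tokens
    return len(topics), sum(topics.values())

base = [("python", 1_000), ("cooking", 500), ("history", 800)]

scaled_up = [(topic, n * 100) for topic, n in base]          # 100x tokens, same topics
broader = base + [("plumbing", 300), ("welding", 300),       # modest tokens, new topics
                  ("court records", 300)]

print(coverage(base))       # (3, 2300)
print(coverage(scaled_up))  # (3, 230000)
print(coverage(broader))    # (6, 3200)
```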
Agreed that our intuitions around scaling are under-theorized at best. The "pro-grounding" folks would argue that what you really need for human-like cognition is embodied interaction with the physical world, and as you note that kind of data is harder to come by. More generally, I think we struggle to make the right generalizations about capabilities (both current and future) from LLM behavior on benchmarks, which is why I'm always talking about construct validity.
Wow, well said.
Don't be. Fusion power will provide us with virtually unlimited energy, using the same process the sun has used to generate energy for roughly 5 billion years.
Then photonic chips will drastically reduce the electrical energy needed to run massive AIs.