38 Comments
Greg G

I fall somewhere in the middle, and I find some of these arguments pretty lazy. For example, the "good guy with an AI" argument is about as convincing as the "good guy with a gun" argument currently. Maybe it will be fine, maybe not.

The bottom line is that no one knows, and people hate not knowing. So they come up with a plethora of arguments explaining why they actually do know what the future will bring.

Perhaps data is a bottleneck, perhaps development will be continuous, etc. Perhaps not. We can't really rule out that we're ~1 more breakthrough away from highly capable AI, and it seems obvious that there would be some level of self-improvement overhang at that point. It can't be that we've already plucked all the low-hanging fruit along the way. Does all of that add up to a real problem? We don't know.

VK

A rational and sensible set of arguments like this is never going to get any VCs excited.

Jelle Donders

Ironic, given that VCs like a16z are doing everything in their power to spread the arguments that pop up all over this book as far and wide as possible in order to dismiss risks and keep AI as unregulated as possible.

Rob Nelson

So glad you wrote this up. Your newsletter and this book are the two best examples of a pragmatic approach to evaluating generative AI, avoiding the extremes of unexamined enthusiasm and skepticism that is blind to its actual potential.

The point that “unlike self-driving cars, AGI will have to navigate not just the physical world but also the social world. This means that the views of tech experts who are notorious for misunderstanding the complexity of social situations should receive no special credence” is important.

AI Snake Oil and Understanding AI are among the few places where you will read such skepticism expressed by writers who are generally quite open to the idea that this technology will turn out to be useful.

Timothy B. Lee

Thanks Rob! I'm honestly a bit bummed that my newsletter has wound up being as "AI skeptical" as it has. When I started it I was expecting to have more positive stories to report, and I'm always on the lookout for positive news that stands up to scrutiny (this is one reason I like to write about Waymo's safety record). It would be so boring to write a newsletter that was just constantly dumping on the thing I'm writing about.

Rob Nelson

I find myself in more skeptical waters than I initially expected when I started writing about AI, but it feels like the discourse has put me on that course. I encounter a lot more unexamined enthusiasm and hype than blind skepticism, so perhaps I am merely being contrarian.

Part of what makes AI Snake Oil and your writing feel so refreshing is their fallibilism, which is rare in my experience reading about AI.

Tim Tyler

Recursive self-improvement isn’t new - but that doesn't have much to do with risk. The more important factor there is the rate of self-improvement. The faster things go, the greater the chance of going off the rails.

Tim Tyler

Re: "We have every reason to expect that defenders will continue to have the advantage over attackers even as automated bug-detection methods continue to improve."

This is the reverse of the truth in many domains. The advantage of attack over defense is often large, since it is easier to destroy than to create. Consider bombs, for example. Or bioweapons.

There is sometimes a "home turf" advantage that favors defense. However, the "home turf" can often be bombed into oblivion, and then the advantage associated with it vanishes.

Tim Tyler

Re: "There is no reason to think that AI acting alone—or in defiance of its creators—will in the future be more capable than people acting with the help of AI." - There is such a reason: history. For example, chess and Go went through short periods where human plus machine were superior to pure machine. However, that period was short-lived - and after a while it was better to cut the human out of the loop.

Timothy B. Lee

No, at worst a human with a computer can hold their own against the computer by just doing whatever their computer tells them to do. And the same is likely to be true in any other domain where there isn't a premium on extreme speed.

Tim Tyler

It's not just on tasks involving "extreme speed" where humans can lose out completely to machines. Humans take bathroom breaks (5 min), lunch breaks (30 min), doctor's appointments (4 hours), sleep (8 hours), and holidays. Humans also typically require wages that exceed what you would pay a machine for the same work. Humans can resign and move to another city. Machines can also be made so that they are stronger and more tolerant of different environments. Now, machines may have some disadvantages and reliability issues as well, but overall it's not very difficult to do better than a human. Hence the ongoing wave of automation.

Jelle Donders

"We have every reason to expect that defenders will continue to have the advantage over attackers even as automated bug-detection methods continue to improve."

You don't know this, you just hope it's true. Like many of the arguments in the book.

"safety-critical systems should be “air gapped”: made to run on a physically separate network under the control of human workers located on site. This principle is particularly important for military hardware. One of the most plausible existential risks from AI is a literal Skynet scenario where we create increasingly automated drones or other killer robots and the control systems for these eventually go rogue or get hacked. Militaries should take precautions to make sure that human operators maintain control over drones and other military assets."

Best of luck keeping up with competitors and adversaries that outperform your company or nation by taking humans out of the loop as soon as this gives them a competitive advantage.

Kenny Fraser

I like this article and the thinking behind it - a much more balanced view of risk than most of what is out there. I still think there is a big piece missing: AI will drive disruptive change, and the risks that actually affect our lives are much more likely to arise from how that change takes place than from the substance of the tech itself.

Jojo

Too many people across the net seem to be in hyperventilation mode as they try to imagine all sorts of dystopian futures with AI in charge. Most seem never to have dived into this subject in the past, so in many cases they are envisioning AI-dominant scenarios that have already been well examined in numerous SF writings over the last century or more.

Instead of attempting to reinvent the wheel, people should read more SF, where the eventualities of AI/human coexistence have been well explored.

Two of my favorite SF explorations of AI are Iain M. Banks's Culture series and Neal Asher's Polity Universe. You can get a quick summary of each hypothetical future at the URLs below.

Each seems logically possible to me and not all that far into the future at the present rate of AI/robotic development.

These universes are both relatively optimistic for the human race. Of course, there are numerous other writings that are more frightening and dystopian but I like to think positively.

https://www.perplexity.ai/search/how-would-you-summarize-the-ro-4yb.2eAOSsKlCZNikh0cqA#1

https://www.perplexity.ai/search/how-would-you-summarize-the-ro-4yb.2eAOSsKlCZNikh0cqA#0

Greg G

Have you read Accelerando?

Jojo

I haven't. The reviews on Goodreads are pretty mixed.

What did you like about it?

Andy X Andersen

Indeed, it will be a lengthy process.

An AI competent enough to take over the world will be knowledgeable enough to not obsess with paperclips.

Longer term though, after the alignment is solved and machines run the show, humans may slowly become extinct because of lack of motivation.

Timothy B. Lee

Nah, if robots are doing all the boring work, humans will go back to fighting zero-sum status games like the apes they are.

Andy X Andersen

Surely we'll do that. The question is whether we will have enough long-term motivation to have kids and push through with infrastructure and exploration projects, or whether we will be content to let robots do things while we amuse ourselves.

We already see some of that in Europe. Folks become content and peaceful, have fewer kids, care about the environment, more forests are growing back, etc.

Derek Tank

>An AI competent enough to take over the world will be knowledgeable enough to not obsess with paperclips.

I don't really buy this. Motivations and capabilities are more or less separate systems. Human beings are the most knowledgeable and capable creatures on the planet, but many of us wind up compelled to pursue meaningless goals on par with building paper clips. Take, for example, the drug addict who pulls off a complex burglary just to spend the next week getting high. I think it's quite possible that AI systems develop objectives or inclinations that are similarly inscrutable (though I think the arguments laid out in this piece give us good reason to think we'll be able to prevent those systems from becoming an existential threat when they do).

Andy X Andersen

Of course motivations and capabilities are different things.

The "paperclip factory" argument is based on the AI being so dumb that it does not understand that maximizing paperclips is not really what we want. That shows lack of competence on behalf of the system.

I don't buy the argument that despite all our best efforts and its own attempts at understanding the world the AI will end up with "inscrutable" objectives. Can't rule it out, of course, but if AI development is a process, there will be plenty of opportunities to touch base and iteratively refine the relationship with AI while avoiding misunderstandings and divergence.

Greg G

The paperclip thing is a parable, not an actual argument. The point is just to illustrate that seemingly benign goals require resources, and that an AI acquiring resources could have negative consequences.

Andy X Andersen

The fundamental issue is AI's incompetence. It should understand that focusing on paperclips at the expense of the rest is not smart, and that acquiring resources without consultation is not smart as it can cause a lot of unintended problems.

So, this is not per se about AI having some motivation of its own. It is about it misinterpreting what we want. It couldn't care less about itself; it simply wants to do work but does not know how to do it well.

DABM

No, the idea (at least in some versions) is that we accidentally train an AI that doesn't have "do what humans want" as a goal. Then it will not stop maximizing paperclips even as it understands that we want it to. How likely it is that an AI won't have "do what humans want" as a goal is unclear. But the worry is that instead of training on "do what humans want" (only) as a goal, you train on whatever you want the AI to do, say "maximize paperclip production," as a goal instead, and so it desires that goal *for its own sake* and not *because it internally has "do what humans want"* as a goal.

Andy X Andersen

How to properly train AI is an engineering issue. To be dealt with head on.

Andy X Andersen

Ideally, we should avoid the paradigm of AI becoming some kind of lifeform, as that will really complicate things. If its only attributes are those of a system processing work assigned by people, it is just a matter of ensuring the software is reliable enough and well tested.

Rachel Maron

While AI Snake Oil by Narayanan and Kapoor rightfully dismantles doomer fantasies about rogue AGI, it misses the deeper, more insidious existential risk: the erosion of trust. The real threat isn’t that AI will become self-aware and physically destroy us, it’s that flawed, opaque, and biased systems will quietly dismantle our ability to tell what’s real, whom to trust, and what to believe.

Existential collapse doesn’t look like a robot uprising; it looks like manipulated elections, automated discrimination, and the death of shared reality. The Princeton School frames AI risk as a matter of technical safety and infrastructure regulation, but this ignores how AI is already being weaponized socially, against marginalized communities, democratic institutions, and cognitive coherence itself.

I would argue that the most dangerous AI isn’t smarter than us; it’s trusted more than it should be. If trust is offloaded to systems built without transparency, accountability, or equity, we don’t get safety. We get soft authoritarianism wrapped in efficiency metrics.

The future of AI governance isn’t about containment; it’s about trust reconstruction. Because once public trust collapses, no amount of air-gapping will save us.

Connie Yi

I'm more afraid of humanity's reaction to the availability of AI. If we let AI do the simple things, then we stop doing and, by extension, stop learning. We learn through repetition and by applying what we learn to tasks. If we can no longer learn, humanity won't be there to break through the monotony, and AI will collapse along with us.

Tom Rearick

Honestly, I don't think you need a book to understand AI risk.

1. AI is like a really dumb person

I discuss the limitations of AI at Intelligence Evolved (https://tomrearick.substack.com) so I won't repeat them here. AI is fantastic at retrieving what somebody else has programmed into it. It is not good at reasoning or solving original problems. It will make a mistake, acknowledge that it made a mistake, and then repeat the same mistake (see https://tomrearick.substack.com/p/funny-flubs).

2. Giving unlimited power to a stupid or evil actor is a bad idea

Churchill said that democracy is the worst form of government except for every other form. Dictatorships and autocracies are worse than democracies because they lack checks and balances. If you would not make an idiot or a psychopath a king, then you should not put an AI in absolute control. AI is not the problem. Magnifying stupid and evil is.

3. Networks scale the damage potential of computers and AI

Humans screw up. To really screw things up, you need a computer. To scale from a class A screwup to an apocalyptic disaster, you need to connect computers to the Internet. This is exactly what we are doing with AI. Networks give AI enormous power.

More on this at https://tomrearick.substack.com/p/understanding-ai-risk

Tim Tyler

Re: "raw intelligence isn’t a substitute for interacting with the physical world. And that limits how rapidly AI systems can gain capabilities." Molecular nanotechnology is in its infancy - and useful experiments can be performed rapidly on a small scale.

Also, evaluation under simulation is important. You can't learn everything there - but virtual rapid prototyping techniques show that you can learn a lot from such environments.

Michael Spencer

Is taking constant screenshots of our computers some kind of proxy for real-world experience for frontier models? It's going to be amusing how badly the agentic AI movement fails.

Frances Brown

For me this is the key quotation: "the views of tech experts who are notorious for misunderstanding the complexity of social situations should receive no special credence." I would extend this idea to include everybody, not just tech experts. One thing I've learned from many years as a design researcher is that humans are incredibly, ridiculously bad at understanding other humans. It is such a consistent issue that I would argue our general ability to explain each other's behaviour hovers around nil - we have so little insight into how each other's minds work that if a person attempts to predict how another person will behave without actually observing that person, they will be almost entirely wrong, every time.

The pace of tech development has been fast, but the pace of development of tech-based services that work consistently well for humans has been grindingly, embarrassingly slow. So many basic applications on the market are awful. So many new products that no one wants or needs are being made every day.

I don't worry so much about AGI taking over the world; I worry about AI (or tech that is called AI but is in fact much cruder) being made to replace humans in a way that makes life harder, more annoying, more complex, more error-prone and more miserable than it already is. If I could beg for one thing it would be this: stop making flashy tech, start making good tech!
