Why I'm not afraid of superintelligent AI taking over the world
The world isn't a giant chess game.
I’m a journalist with a computer science master’s degree. In the past I’ve written for the Washington Post, Ars Technica, and other publications.
In 1997, an IBM computer called Deep Blue stunned the world by defeating the world chess champion, Garry Kasparov. Then chess software kept getting better. Today, the highest-ranked chess software, Stockfish, has an Elo rating of 3545, while the top human player, Magnus Carlsen, is rated at 2839. That gap implies Carlsen would lose the overwhelming majority of games he played against Stockfish.
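To make that gap concrete: the standard Elo model converts a rating difference into an expected score. A quick calculation with the ratings above (a minimal sketch, nothing more):

```python
# Expected score under the standard Elo model: 1 / (1 + 10^((opponent - player) / 400))
def elo_expected_score(player_rating: float, opponent_rating: float) -> float:
    return 1 / (1 + 10 ** ((opponent_rating - player_rating) / 400))

# Carlsen (2839) vs. Stockfish (3545): an expected score of roughly 1.7 percent
# (draws count as half a point), i.e. Carlsen loses the overwhelming majority of games.
print(elo_expected_score(2839, 3545))  # ~0.017
```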
The defeat and subsequent humiliation of human chess players by computer programs made a big impression on people thinking about the future of artificial intelligence. Chess featured prominently in philosopher Nick Bostrom’s 2014 book, Superintelligence.
“One might have thought that great chess playing requires being able to learn abstract concepts, think cleverly about strategy, compose flexible plans, make a wide range of ingenious logical deductions, and maybe even model one’s opponent’s thinking,” Bostrom wrote. “Not so. It turned out to be possible to build a perfectly fine chess engine around a special-purpose algorithm.”
“It is tempting to speculate that other capabilities—such as general reasoning ability, or some key ability involved in programming—might likewise be achievable through some surprisingly simple algorithm,” Bostrom added.
Bostrom’s thesis was that humanity’s defeat at chess was just the beginning. He predicted that in the decades to come, computers would become better than humans at almost every cognitive task. And not just a little bit better. As with chess, Bostrom thought computer capabilities would soar past those of humans on a wide range of tasks from scientific discovery to military conquest. Bostrom warned that advanced AI could escape from the lab and literally take over the world.
Bostrom’s book was hugely influential. Elon Musk praised it shortly after its publication, describing AI as “potentially more dangerous than nukes.” In early 2015, Sam Altman described the book as “the best thing I’ve seen” on superhuman machine intelligence, which Altman considered “the greatest threat to the continued existence of humanity.” Later that year, Musk and Altman teamed up with several others to create OpenAI in hopes of steering AI technology in a more benign direction.
Bostrom acolytes scored a victory last month when the Biden Administration issued a sweeping executive order on artificial intelligence. Among other things, the order requires the creators of large AI models to report on the results of red-team safety testing, including their potential to evade human control through “deception or obfuscation.”
I did not find Bostrom’s arguments convincing back in 2015, and I still don’t. I think the emergence of superhuman chess software has led a lot of people astray. Chess is a game of perfect information and simple, deterministic rules. This means that it’s always possible to make chess software more powerful with better algorithms and more computing power.
But most important cognitive tasks—from translating a document to inventing a new technology to running an organization—don’t work like this. For these tasks, knowledge is at least as important as computing power. And because knowledge isn’t fungible, we’re not going to get a single super-AI that’s better than humans at everything. Instead, we’ll get a lot of different AIs with different strengths and weaknesses, none of which will be able to take over the world.
How computers surpassed humans at chess
It was December 2001 and I was a computer science major at the University of Minnesota. I had enrolled in an AI class and my three-person team was presenting our final project: a Java program that played chess.
We hooked my laptop up to the projector and let our professor play against our software while we did our presentation. But it soon became clear that people were more interested in the game than our slides. People started shouting out suggestions for the professor’s next move. Eventually we stopped talking and stood watching as our program backed his king into a corner and checkmated him.
“You get an A,” the professor said with a chuckle as we headed back to our seats.
Our victorious chess engine was based on a fairly simple algorithm (sketched in code just after this list):
Make a list of all possible four-move sequences.
Score each resulting board position, where a higher score reflects a better position for white.
Working backwards, find the strongest move at each board position—first three moves out, then two, then one—assuming that white will choose the move that yields the highest score, while black will do the opposite.
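Here is a minimal sketch of that kind of fixed-depth minimax search. It uses the open-source python-chess library for move generation and a crude piece-counting evaluation; it is an illustration of the idea, not a reconstruction of our original Java program.

```python
# A minimal fixed-depth minimax search in the spirit of the algorithm above.
# Illustrative sketch only, using the open-source python-chess library.
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def evaluate(board: chess.Board) -> int:
    """Simple piece-counting score: positive favors white, negative favors black."""
    return sum(PIECE_VALUES[piece.piece_type] * (1 if piece.color == chess.WHITE else -1)
               for piece in board.piece_map().values())

def minimax(board: chess.Board, depth: int) -> int:
    """Score a position by searching `depth` half-moves ahead."""
    moves = list(board.legal_moves)
    if depth == 0 or not moves:
        return evaluate(board)
    scores = []
    for move in moves:
        board.push(move)
        scores.append(minimax(board, depth - 1))
        board.pop()
    # The side to move picks its best continuation: white maximizes, black minimizes.
    return max(scores) if board.turn else min(scores)

def best_move(board: chess.Board, depth: int = 4) -> chess.Move:
    """Pick the move leading to the best minimax score for the side to move."""
    def score(move: chess.Move) -> int:
        board.push(move)
        value = minimax(board, depth - 1)
        board.pop()
        return value
    chooser = max if board.turn else min
    return chooser(list(board.legal_moves), key=score)

print(best_move(chess.Board(), depth=3))  # prints an opening move (many moves tie with balanced material)
```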
Today’s best chess engines are far more sophisticated than this. My team’s software scored positions using a simple piece-counting rule (queen=9 points, rook=5 points, etc). Today’s best chess engines use sophisticated neural networks to evaluate board positions.
Modern chess engines can also look much farther into the future. This is partly because increased computing power allows them to evaluate more positions. It’s also because sophisticated algorithms allow chess engines to focus on the most “likely” move sequences. As a result, modern chess engines can “look ahead” dozens of moves—something no human player can do.
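To give a flavor of what such algorithmic refinements look like, here is one classic example: alpha-beta pruning, which returns the same answer as plain minimax while skipping branches that cannot change the result. Real engines layer far more on top of this (clever move ordering, selective search, neural evaluation), so treat this as a sketch of the simplest such trick, again using python-chess.

```python
# Alpha-beta pruning: same result as plain minimax, but branches that cannot
# affect the final answer are skipped, which is part of why engines can search deeper.
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def evaluate(board: chess.Board) -> int:
    """Same simple piece-count score as in the earlier sketch."""
    return sum(PIECE_VALUES[p.piece_type] * (1 if p.color == chess.WHITE else -1)
               for p in board.piece_map().values())

def alphabeta(board: chess.Board, depth: int,
              alpha: float = float("-inf"), beta: float = float("inf")) -> float:
    """Minimax value of the position, pruning lines outside the (alpha, beta) window."""
    if depth == 0 or board.is_game_over():
        return evaluate(board)
    if board.turn:  # white to move: maximize
        value = float("-inf")
        for move in list(board.legal_moves):
            board.push(move)
            value = max(value, alphabeta(board, depth - 1, alpha, beta))
            board.pop()
            alpha = max(alpha, value)
            if alpha >= beta:  # black already has a better option elsewhere; stop searching
                break
        return value
    else:           # black to move: minimize
        value = float("inf")
        for move in list(board.legal_moves):
            board.push(move)
            value = min(value, alphabeta(board, depth - 1, alpha, beta))
            board.pop()
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

print(alphabeta(chess.Board(), depth=3))  # prints 0: material stays balanced with best play
```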
The worry for singularists like Bostrom is that a lot of other cognitive tasks will turn out like chess: after computers reach parity with human beings, they’ll continue getting better and soon humans will be far behind. The rapid progress of large language models over the last five years has intensified those concerns.
Scaling language models
In 2020, OpenAI published a landmark paper showing a strong relationship between the size of a transformer-based language model (like GPT-3) and its performance. The researchers found that larger models—those with more trainable parameters—were better able to predict the next word in a document. For a given model size, “architectural details such as network width or depth have minimal effects within a wide range.”
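The headline relationship in that paper is a power law: test loss falls smoothly as a power of model size (and, separately, of dataset size). Here is a rough sketch of those fits; the exponents and constants are the approximate values reported in the paper, and the predicted losses are in the paper’s units (nats per token), so treat the exact numbers as ballpark.

```python
# Approximate scaling-law fits from Kaplan et al. (2020): loss falls as a power law
# in model size N (non-embedding parameters) and dataset size D (tokens).
# Constants are the paper's approximate values; treat them as ballpark figures.
ALPHA_N, N_C = 0.076, 8.8e13   # loss vs. parameters
ALPHA_D, D_C = 0.095, 5.4e13   # loss vs. training tokens

def loss_from_params(n_params: float) -> float:
    return (N_C / n_params) ** ALPHA_N

def loss_from_tokens(n_tokens: float) -> float:
    return (D_C / n_tokens) ** ALPHA_D

# Doubling model size multiplies loss by about 2**-0.076 (roughly 0.95):
# a modest but remarkably steady improvement across many orders of magnitude.
for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {loss_from_params(n):.2f}")
```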
Larger models require more computing power to train, so the paper helped to set off a computing arms race that continues to this day. As language models got bigger, they seemed to be gaining new human-like abilities.
But the “scaling laws” OpenAI articulated in 2020 came with an important caveat: larger models need more training data. This point was underscored by a widely read 2022 paper from Google’s DeepMind, which found that “for every doubling of model size the number of training tokens should also be doubled.”
The numbers are mind-boggling. DeepMind built a model called Chinchilla and trained it with 1.4 trillion tokens. That huge training set enabled Chinchilla to outperform OpenAI’s GPT-3 (which was trained for 300 billion tokens) despite the fact that GPT-3 had more than twice as many parameters (175 billion) as Chinchilla (70 billion).
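A common way to summarize the DeepMind finding is a rule of thumb: a compute-optimal model should be trained on roughly 20 tokens per parameter. A quick sanity check against the numbers above:

```python
# Rough "Chinchilla" rule of thumb: compute-optimal training uses ~20 tokens per parameter.
TOKENS_PER_PARAM = 20  # approximate ratio implied by the DeepMind paper

def optimal_tokens(n_params: float) -> float:
    return TOKENS_PER_PARAM * n_params

print(optimal_tokens(70e9))   # Chinchilla, 70B params -> ~1.4 trillion tokens (matches its training set)
print(optimal_tokens(175e9))  # GPT-3, 175B params -> ~3.5 trillion tokens, versus the ~300 billion it actually saw
```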
In short, to make powerful language models, you need a lot of computing power and you also need a lot of data. Which raises an important question: could we run out of training data?
Last year a nonprofit called Epoch AI published a paper estimating that roughly 9 trillion words of high-quality data were available for training. If large language models continue growing at recent rates, they’ll bump up against this limit around 2026.
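To see roughly where an estimate like that comes from, here is an illustrative extrapolation. Only the 9-trillion-word stock comes from the Epoch estimate above; the starting dataset size and growth rate are hypothetical round numbers chosen for illustration.

```python
# Illustrative extrapolation: when do frontier training sets exhaust the stock of
# high-quality text? Starting size and growth rate are hypothetical round numbers;
# Epoch AI's actual model is more careful than this.
HIGH_QUALITY_STOCK = 9e12   # ~9 trillion words (Epoch AI's estimate)
dataset_words = 1.5e12      # hypothetical: ~1.5 trillion words in a 2023-era training set
annual_growth = 2.0         # hypothetical: training sets double every year

year = 2023
while dataset_words < HIGH_QUALITY_STOCK:
    year += 1
    dataset_words *= annual_growth
print(year)  # with these assumptions, the limit is hit in 2026
```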
It’s not clear how binding this constraint will ultimately be. One recent paper found that it’s possible to reuse the same text up to four times with little loss of performance. It’s also possible we’ll find ways to automate the creation of training data—though early efforts in this direction haven’t gone well.
But at a minimum, the need for training data complicates naive assumptions that Moore’s law will inevitably push LLMs to human capabilities and beyond.
Knowledge, not data
So far I’ve been talking about data because that’s what most empirical research has focused on. But what ultimately matters isn’t the number of tokens in a training set, it’s the amount of useful information—knowledge—contained in training data.
I think it’s helpful here to draw an analogy to the learning process of human beings. Imagine a smart college freshman taking an introductory physics course. She may have a lot of what psychologists call fluid intelligence—a general capacity to learn new information and solve novel problems. But she won’t initially be very good at solving physics problems—which psychologists call crystalized intelligence. Her physics acumen will develop over time as she does physics coursework.
If she goes on to grad school the nature of her learning will change. She’ll transition from mostly learning about the discoveries of other physicists to trying to make discoveries of her own. Her understanding of the laws of physics will continue to advance, but much more slowly than when she was speed-running through the discoveries of Newton, Maxwell, Einstein and Bohr.
You could say this slowdown would occur because she “ran out of training data,” but that’s not quite right. There would still be plenty of physics textbooks she hasn’t read, but they’d largely cover material she already knows.
An LLM in training is a bit like a student with very low fluid intelligence. An LLM often needs to read thousands of documents on a particular topic to start understanding it. Data scientists compensate for the poor learning capacity of LLMs by using enormous training sets that repeat important facts and concepts over and over.
I expect that in the coming years we’ll develop language models with higher fluid intelligence—models that can glean more information from fewer documents, the way people do. If that happens, the apparent shortage of training data could prove illusory.
But I think it’s a mistake to assume that removing the data bottleneck in this way would clear the way for models to zoom past humans and achieve superintelligence. Think again about our hypothetical graduate student. Let’s say that she was able to reach the frontiers of physics knowledge after reading 20 textbooks. Could she have achieved a superhuman understanding of physics by reading 200 textbooks? Obviously not. Those extra 180 textbooks contain a lot of words, but not much knowledge she doesn’t already have.
So too with AI systems. I suspect that on many tasks, their performance will plateau around human level. Not because they “run out of data,” but because they will have reached the frontiers of human knowledge.
Thomas Edison’s insight
The simplicity and predictability of chess allow computers to “look ahead” and anticipate the likely consequences of any potential move. Most real-world problems are not like that.
For example, there’s a famous military saying that “no plan survives contact with the enemy.” The world is complex, and military planners are invariably working with incomplete and inaccurate information. When a battle begins, they inevitably discover that some of their assumptions were wrong and the battle plays out in ways they didn’t anticipate.
Thomas Edison had a saying that expresses a similar idea: “genius is one percent inspiration and 99 percent perspiration.” Edison experimented with 1,600 different materials to find a good material for the filament in his most famous invention, the electric light bulb.
“I never had an idea in my life,” Edison once said. “My so-called inventions already existed in the environment—I took them out. I’ve created nothing. Nobody does. There’s no such thing as an idea being brain-born; everything comes from the outside.”
A similar view is widely held in today’s Silicon Valley startup scene. In his influential 2011 book The Lean Startup, Eric Ries urged entrepreneurs to quickly build a “minimum viable product” so they could start getting feedback from real customers. It’s easy to think of products customers might want, but the only way to find out if they actually want it is to build it and try to sell it.
These are not interactions that can be “gamed out” digitally the way a chess engine analyzes possible chess moves. They require interacting with the real world: launching military campaigns, building and testing prototypes, putting products in the hands of real customers. This kind of trial and error is slow and usually requires a lot of help from other people.
Singularists tend to neglect the importance of trial and error. Bostrom, for example, writes that a superintelligent AI system might use nanotechnology to take over the world. Specifically, he envisions it producing “a detailed blueprint for how to bootstrap from existing technology (such as biotechnology and protein engineering) to the constructor capabilities needed for high-throughput atomically precise manufacturing that would allow inexpensive fabrication of a much wider range of nanomechanical structures.”
Next he envisions the AI covertly manufacturing a huge number of nano-robots that could spread across the globe. He predicts that “at a pre-set time, nanofactories producing nerve gas or target-seeking mosquito-like robots might then burgeon forth simultaneously from every square meter of the globe.”
It is not clear to me if this type of nanotechnology is even physically possible. But assuming it is, Bostrom seems naive about the amount of work it takes to go from concept to execution. Not only would this kind of invention require years of painstaking trial and error in the physical world, it would also likely require complex and expensive machinery, like the astronomically expensive machines used to manufacture today’s most advanced silicon chips.
I asked Bostrom about this in an October interview. He acknowledged it was a challenge, but argued that a superintelligent AI might be able to come up with designs that don’t require much trial and error:
If you are trying to build the molecular structure, maybe one particular structure depends on various empirical parameters, you have to craft it just right and measure it precisely. If you're smarter, you'll see there's a huge design space, and pick another structure that is more robust to uncertainty, that you can be quite confident would work. It’s a little bit like if you design a lego set or something like that, it kind of clicks into place, you might not have to measure after you put the right lego block in place.
I don’t doubt that future AI systems could accelerate progress in nanotechnology by suggesting novel approaches. But in Edison’s estimation, a good idea only gets you about 1 percent of the way to a useful invention. The other 99 percent of the work—the “perspiration”—is going to require a lot of human help. And people aren’t going to want to help out unless they understand what they’re being asked to do and why.
Two claims about superintelligence
It’s helpful to distinguish between two different claims that fall under the broad heading of superintelligence:
The weak superintelligence thesis says that due to Moore’s Law, AI systems will eventually achieve superhuman performance on most or all cognitive tasks.
The strong superintelligence thesis says that a single AI system will master a wide range of tasks—and do it so rapidly that it is able to take over the world.
Last week I shared an early draft of this essay with my friend Sam Hammond, who believes that artificial general intelligence is closer than you might think. In his response, Sam pressed me to take the weak superintelligence thesis seriously, asking "what special sauce do humans have for adding value to scientific and technological progress that cannot be recreated by machine intelligence?"
On a fundamental level, I don’t think humans have any “special sauce.” I think it’s entirely possible that AI systems will eventually surpass humans at almost all cognitive tasks. But it matters how—and how quickly—this happens.
If intelligence is mostly a matter of computing power, that points toward a winner-take-all competition. The smartest AI will be able to generate huge profits, which it can plow into buying more GPUs so it can become even more intelligent. In this “fast takeoff” scenario, high intelligence would enable a single powerful AI system to prevail in everything from technological innovation to military strategy.
But things look different if you assume that knowledge, rather than computing power, is the main bottleneck to increasing intelligence. Computing power is fungible; knowledge is not. Extensive knowledge about French literature, for example, won’t help an AI system design better rockets or computer chips.
So if knowledge is the essential input for intelligence, power is likely to remain widely distributed. Rocket companies know a lot about rocket design, so the best AI for designing rockets is likely to be made by (or in partnership with) rocket companies. Chip companies know a lot about chip design, so the best AI for designing computer chips is likely to be made by (or in partnership with) chip companies.
Importantly, the world’s militaries and defense contractors know a lot about manufacturing weapons—and are unlikely to share their knowledge with a rogue AI looking to build a robot army.
Of course some organizations will fail to adapt and may be displaced by new “AI native” organizations, just as companies like Kodak and Blockbuster were driven into bankruptcy by earlier waves of digital technology. But many other incumbents will adapt and thrive in the AI era.
The result won’t be a “singleton” that takes over the world, as predicted by the strong superintelligence thesis. Rather, we’ll get a pluralistic and competitive economy that’s not too different from the one we have now.
In 1945, the economist Friedrich Hayek wrote an essay called “The Use of Knowledge in Society.” It was a critique of socialists who called for economies to be centrally planned by national governments.
These socialists implicitly thought of a nation’s economy as a giant chess board. They envisioned government officials surveying the board and moving economic resources around like chess pieces. They thought brainy government officials would be able to allocate these resources more efficiently than private individuals and firms.
But Hayek predicted that the socialists’ schemes would fail because the knowledge required to make efficient economic decisions was spread out in bits and pieces across the economy. He argued that a decentralized market economy is needed to enable the “man on the spot” to put his knowledge and experience to use.
I see singularists as making a similar mistake. They assume that all problems can be solved with the application of enough brainpower. But for many problems, having the right knowledge matters more. And a lot of economically significant knowledge is not contained in any public data set. It’s locked up in the brains and private databases of millions of individuals and organizations spread across the economy and around the world.
Thanks to Sean Trott, Sam Hammond, and Benjamin Riley for helpful comments on this essay. And thanks to Sean and Nick Bostrom for suggesting the distinction between fluid and crystalized intelligence.
I appreciate you writing this article! I've been wondering what your thoughts are on AI risk ever since you started the blog.
As some background, I first encountered you on Full Stack Economics, and when you announced this blog I subscribed here as well. Thus far I've found it very well-written and informative. In particular I loved your deep dives into self-driving technology, and found them very useful for forming my own opinions in that arena. You're one of my primary sources for news on contemporary AI developments, and I really appreciate the blog.
With that context, I want to say that I found this article to be very disappointing. It barely engages with the arguments in favor of AI risk, either handwaving them away without justification or omitting them entirely. Several sections even contain relatively simple mathematical errors that have nothing to do with AI in particular.
I'm writing up this comment because I believe AI to be by far the most impactful technology on the horizon, and it's vital that we can make good predictions on its impacts. If AI is indeed a threat to humanity, that would eclipse the importance of nearly every other issue humanity faces, and would justify strong measures to prevent it. And if AI is *not* such a threat, it has the potential to end poverty and war, saving millions of lives. In the latter case, we have a responsibility to develop it as quickly as possible. Figuring out which prediction is correct is *really important*.
To address things one at a time:
Chess:
You say that people have been misled by chess, because chess follows simple deterministic rules and can therefore be solved by algorithms, which doesn't apply to the real world. This is a category error; there's no sharp delineation between those two domains. The real world, just like chess, follows a set of relatively simple deterministic rules called "physics". Each "move" leads to a known outcome, which can be brute-force searched.
The difference, of course, is that the real-world game tree is vastly larger. A chess position offers about 35 legal moves on average, while the observable universe contains roughly 10^80 particles. However, this matters less than you might think, since chess's game tree is *already* far too large for brute-force searches deeper than just a few moves, as in your computer science class. Chess-playing algorithms succeeded through aggressive tree-pruning that got the search space down to a manageable size, along with heuristics hardcoded from human experience.
The piece valuation you used in your program is exactly such a fuzzy heuristic; nothing in the rules of chess assigns a value of "5" to a rook, and the actual usefulness of a rook varies wildly based on the exact position. Humans played thousands of games of chess, learned via trial and error and intuition how useful each piece was relative to each other piece, and then hardcoded that into their computers. A chess-playing algorithm like yours is *already* doing exactly the sort of knowledge-based heuristic approach that you claim computers aren't good at.
Early chess programs like Deep Blue did rely on humans to explicitly program in those heuristics; they weren't doing the foundational reasoning themselves. But that changed in 2017 with AlphaZero, which learned chess entirely from scratch with a neural network. It trained by playing against itself for only about 9 hours and was then pitted against the best hand-coded chess engine, Stockfish. AlphaZero won 25 games with the white pieces and 3 with the black pieces, drew the rest, and lost none.
The sort of pure algorithmic approach to games that you describe can only be used on very simple games like tic-tac-toe. Most of the things that computers have recently started doing much better than humans rely on fuzzy heuristics learned by trial and error, just like humans do. AlphaStar, for example, is a neural network that can play StarCraft better than almost all humans. (StarCraft has a vastly larger game tree than chess, being more akin to the real world in the fine-grained ways actions can differ, and is also a hidden-information game where players have to reason probabilistically about what their opponents have access to or may do.) OpenAI Five does the same with Dota 2. And outside of video games, DALL-E has far surpassed human artists in generality, visual beauty, and fidelity. (It's still very poor at understanding an English description and converting it into a conceptually corresponding image, but that's a different skill.)
Your understanding of the real world also seems quite simplistic in certain domains. You say "The simplicity and predictability of chess allow computers to “look ahead” and anticipate the likely consequences of any potential move. Most real-world problems are not like that." and offer military planning as an example; but much of military strategy consists of doing exactly what you claim planners can't do! The field of mathematical game theory was developed largely as a way to predict the actions of other nation-states in response to possible decisions, just like one does in chess. As you point out, real-world planning is a partial-information game rather than a perfect-information game like chess, but that doesn't really have anything to do with the ability to plan ahead. Planning ahead in a hidden-information game looks very much the same as in chess, except that you ascribe probabilities to each of your opponent's moves and pick the move with the highest expected value.
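To make the expected-value point concrete, here's a toy sketch of that calculation; the moves, probabilities, and payoffs are made up purely for illustration:

```python
# Toy sketch of planning under hidden information: like minimax, except the opponent's
# unknown responses are weighted by assumed probabilities and we pick the move with the
# highest expected value. All moves, probabilities, and payoffs here are hypothetical.
GAME = {
    "advance": [(0.6, +5), (0.4, -10)],  # big upside, but costly if the enemy anticipated it
    "hold":    [(0.9, +1), (0.1, -2)],   # modest and safe
    "flank":   [(0.5, +3), (0.5, 0)],
}

def expected_value(outcomes):
    return sum(probability * payoff for probability, payoff in outcomes)

best = max(GAME, key=lambda move: expected_value(GAME[move]))
print(best, expected_value(GAME[best]))  # "flank" with an expected payoff of 1.5
```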
There's a reason why game theory and wargaming both have "game" in their names; there's no sharp delineation between "game" and "geopolitics"; they're both complicated systems of rules, agents, incentives, and payoffs. Geopolitics is the same kind of thing as board games, just a more complicated instance.
Knowledge vs. computation:
If I understand your argument correctly, it's that general artificial intelligence will require more training data than humans currently have available to give it, and that much of the data we do have is redundant.
I think you actually understate part of this argument. The first important question is whether neural networks are capable of general intelligence *at all*. Our understanding of the human brain is extremely poor, and while neural networks are similar to them in many ways, they're also different in many ways. It's entirely possible that no amount of training data could ever get a neural network to human-level intelligence. (For more on this I'd highly recommend the debate between Scott Alexander and Gary Marcus: https://www.astralcodexten.com/p/somewhat-contra-marcus-on-ai-scaling)
But assuming that neural networks are capable in theory of general intelligence, it seems unjustified to point to limited training data as a relevant constraint.
* You point to a paper that estimates we'll run out of training data by 2026. This may be true, but what about the ~2.5 years before that happens? We've already seen dramatic improvement from GPT-2 to GPT-4, and if there is some point at which the amount of training data becomes "enough", you haven't provided any estimate of where exactly that point is, and it's entirely possible that it's above GPT-4 but below the total amount of data we have to throw at GPT-5.
* Humans are generating data at a frantic rate that's only increasing as the internet plays a larger and larger part of our lives. We may "run out" of unused training data in 2026, but that would only limit growth in training dataset size to the amount of data that humanity produces in a year, which is... a lot. Even if the amount of data needed for GAI is above the 2026 threshold, we'll still get there eventually, potentially only a few years later.
* You focus on human-created data, such as English passages. This is presumably because current leading AI models are language models, which is because that's what people want. AIs that can predict human language would be very useful to humanity, so that's where most of the funding goes. But when we're talking about *general* intelligence, capable of reasoning about the world from first principles and learning in much the same way that a human baby does, why would it need to be training on human language to start out with? There's nothing fundamentally special about humans, we're just a particularly complicated part of physics. The Large Hadron Collider produces more than 1 petabyte of data *per day*. The Event Horizon Telescope collected 5.5 petabytes of data in April of 2018. What happens when someone pipes all of that into a massive AI model? Nobody's done it yet because anything short of general intelligence will be unhelpful to the physics community, so the funding just isn't there. But if the rapid pace of increasing interest in AI continues, someone will do it eventually, and an AI capable of predicting physics is also capable of predicting human behavior as a side effect, since humans run on physics.
(Continued in a reply, I ran into the comment length limit.)
I'm sympathetic to the overall argument, but if one person could reach the frontiers of (written) knowledge in every field at the same time, they could probably come up with a lot of novel ideas. Actual academic disciplines remain very siloed, useful human lifespans are pretty short if you consider it takes maybe 10 years to reach the frontier of a PhD-narrow field, and the incentives are very much against reaching that frontier in (superficially) unrelated disciplines.