Joe Biden's ambitious plan to regulate AI, explained
Foundation models larger than GPT-4 will get mandatory red-team testing.
When President Joe Biden signed an executive order to regulate artificial intelligence yesterday, he called it “the most significant action any government, anywhere in the world, has ever taken on AI safety, security, and trust.” That may not have been an exaggeration.
The order runs for more than 100 pages and has a wide range of objectives, from fighting algorithmic discrimination to easing immigration for those with AI skills. But Biden’s most significant action is to invoke emergency powers to impose new regulations on so-called foundation models.
Going forward, anyone building a model significantly more powerful than GPT-4 will be required to conduct red-team safety tests and report the results of those tests to the federal government. American cloud providers will also be required to monitor their foreign customers and report to the US government if they appear to be training large models.
Models that analyze biological sequence data—which could be used in bioterrorism—get extra scrutiny. The order starts developing a regulatory framework for better oversight of biology labs that could be used to synthesize dangerous pathogens designed with future AI systems.
The new rules are based on the Defense Production Act and the International Emergency Economic Powers Act, two laws that were intended to give the president broad emergency powers during a war or other international crisis.
It’s not obvious that the invention of large language models qualifies as such an emergency. But in recent years it has become common for presidents to invoke these laws to deal with domestic challenges. Donald Trump used the DPA to force American companies to produce masks and ventilators in 2020, and Joe Biden used the DPA last year to speed up production of baby formula. Now Biden is invoking it to regulate the emerging AI industry, bypassing a Congress that was nowhere close to passing legislation in this area.
Biden’s quick and decisive action represents a triumph for singularists who believe that AI could pose an existential risk to humanity in the next decade. A year ago, hardly anyone in Washington was taking this risk seriously. Now—assuming Biden’s order is upheld by the courts—we have the beginnings of a legal framework to guard against that danger.
The 10²⁶ club
Supporters of regulation have long advocated a focus on foundation models—powerful models like GPT-4 and Claude 2 that have shown a remarkable ability to generalize from their training data to solve new problems. But it seems tricky to define which foundation models need oversight. An overly broad definition could burden small organizations building harmless models, while a narrow definition could let dangerous models slip through the cracks.
The Biden executive order deals with this in two ways. In the short term, it just sets an arbitrary threshold: if a model required more than 10²⁶ mathematical operations to train, then it will be subject to the new rules.
10²⁶ is 100 trillion trillion operations. GPT-4 is rumored to have required roughly 20 trillion trillion operations. So the new rules will likely apply to any future model that requires several times more computing power than GPT-4. Given the exponential growth in computing power, it seems likely that the next generation of foundation models from OpenAI and its competitors will be subject to the rules.
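If you want to see the arithmetic, here is a minimal sketch in Python. The only number taken from the order itself is the 10²⁶ threshold; the 2×10²⁵ estimate for GPT-4’s training compute is an unconfirmed rumor, used purely for illustration.

```python
# Back-of-the-envelope check of the order's training-compute threshold.
# The GPT-4 figure is an unconfirmed public estimate, not an official number.

REPORTING_THRESHOLD = 1e26  # mathematical operations used during training

def must_report(training_ops: float) -> bool:
    """True if an estimated training run exceeds the order's reporting threshold."""
    return training_ops > REPORTING_THRESHOLD

gpt4_estimate = 2e25  # rumored; roughly 20 trillion trillion operations

print(must_report(gpt4_estimate))           # False: GPT-4 itself would be under the line
print(must_report(6 * gpt4_estimate))       # True: a model ~6x GPT-4's compute crosses it
print(REPORTING_THRESHOLD / gpt4_estimate)  # 5.0 -- headroom above the GPT-4 estimate
```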
The order also requires reporting by any computing cluster with more than 100 Gbit/s of internal bandwidth and “theoretical maximum computing capacity” of 10²⁰ mathematical operations per second.
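To get a feel for what that cluster criterion means in practice, here is a rough sketch. Both conditions have to hold, and the per-GPU peak of roughly 2×10¹⁵ operations per second used below is my own illustrative assumption (approximately a current datacenter GPU running at low precision), not a figure from the order.

```python
# Rough sketch of the cluster-reporting criterion: high internal bandwidth
# AND a large theoretical maximum computing capacity.

FLOPS_THRESHOLD = 1e20       # theoretical max operations per second
BANDWIDTH_THRESHOLD = 100    # Gbit/s of internal networking

ASSUMED_PEAK_PER_GPU = 2e15  # ops/sec per accelerator -- an assumption, not from the order

def cluster_must_report(num_gpus: int, bandwidth_gbps: float) -> bool:
    theoretical_max = num_gpus * ASSUMED_PEAK_PER_GPU
    return theoretical_max >= FLOPS_THRESHOLD and bandwidth_gbps > BANDWIDTH_THRESHOLD

print(FLOPS_THRESHOLD / ASSUMED_PEAK_PER_GPU)   # 50000.0 -- on the order of 50,000 such GPUs
print(cluster_must_report(num_gpus=50_000, bandwidth_gbps=400))  # True
```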
Down the road, these crude numerical thresholds will be superseded by new standards drafted by the Commerce Department. Future algorithmic improvements may make it possible to achieve performance better than GPT-4 with significantly less than 10²⁶ mathematical operations of training. So it makes sense for the threshold to be refined over time.
Organizations building these foundation models will be required to report regularly to the federal government about the steps they are taking to keep their models secure against physical and cybersecurity threats. They’ll also need to perform red-team safety tests using guidance being developed by the National Institute of Standards and Technology.
A framework for future oversight
These rules will have little impact in the short term. As far as I can tell, no currently released foundation models were trained on more than 10²⁶ operations. Moreover, the organizations most likely to train such models in the next year or two—including OpenAI, Anthropic, Meta, and Google’s DeepMind—already signed on to a July White House statement committing to test their models before deploying them.
But the order sets a precedent that could be hugely significant over the longer term. Available computing power is likely to continue growing exponentially, so while 10²⁶ operations is a fairly high threshold today, it may not be such a high bar in five or ten years. And once the basic legal framework is in place—and upheld by the courts—it will be relatively easy to update the rules to cover more models.
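As a rough illustration of how quickly the bar could stop being high: if the compute used in frontier training runs keeps growing at something like 4x per year (my own assumption, not a figure from the order or any official source), the 10²⁶ line gets crossed almost immediately, and runs ten or a hundred times larger follow within a few years.

```python
import math

# Hedged projection: starting from the rumored ~2e25 operations for GPT-4,
# how long until frontier runs exceed 1e26 at an assumed growth rate?

current_frontier = 2e25   # rumored GPT-4 estimate, used only as a starting point
threshold = 1e26
annual_growth = 4.0       # assumed multiplier per year -- not from the order

years = math.log(threshold / current_frontier) / math.log(annual_growth)
print(f"~{years:.1f} years until the threshold at {annual_growth:.0f}x/year")  # ~1.2 years
```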
A big open question is how these rules will be applied to openly licensed models. The Biden framework implicitly assumes that a foundation model will be created by a single company that trains it, tests it, and submits required reports to the government. The rules about securing model weights seem to assume that companies will want to keep their weights secret.
But there’s a thriving community of technologists who train models and publish the weights for anyone to use and modify. Right now, none of those models are large enough to trigger the reporting requirements. But in a few years, people might want to publish the weights of models that are above the reporting threshold. Will that be allowed? It’s not clear.
If it is allowed, the government will need to decide whether someone who modifies someone else’s model needs to conduct their own red-team analysis. If the answer is “yes,” it could create a huge burden for small companies and flood the government with paperwork. If the answer is “no,” on the other hand, it could create an easy way for malicious actors to evade the reporting requirements.
The order also establishes a new framework for regulating foreigners who try to train powerful models using US cloud services. American companies that provide infrastructure as a service are required to identify their foreign customers and then monitor their activities. If they appear to be training a powerful foundation model, the company must report this activity to the US government.
This seems like another change that could have far-reaching implications in the long run. This requirement doesn’t seem to be limited to large cloud providers, so many US companies could wind up with a new reporting obligation. Moreover, once the infrastructure is in place to do this kind of surveillance, the US government might be very tempted to tack on additional reporting requirements that might have little to do with AI.
A focus on national security
The provisions requiring testing of foundation models focus on three big threats: weapons of mass destruction, cybersecurity, and “evasion of human control” through “deception or obfuscation.”
The weapons provision seems straightforward. Authorities are worried about a terrorist using a chatbot to learn how to create a nuclear, biological, radiological, or chemical weapon. To guard against this, companies will need to hire red teams to see if they can get these systems to provide information about how to make a nuclear weapon or a dangerous pathogen.
It is not clear how serious a threat this is, at least in the short run. Current chatbots seem to largely summarize facts from their training data, which is overwhelmingly harvested from the open web. A determined terrorist may be able to find all of the same information using ordinary Google searches.
However, there are early signs that future AI systems could be more dangerous than that—especially in the realm of biology. For example, a couple of years ago a transformer-based AI system solved the protein folding problem, which had vexed human researchers for decades. It’s not that hard to imagine future AI systems being better than humans at predicting how to modify a virus to make it more dangerous.
To guard against this threat, the Biden administration sets a much lower numeric threshold—10²³ operations—for AI models that are trained on biological sequence data. The order also begins creating infrastructure and guidelines to help biology labs identify and screen out requests to synthesize dangerous genetic sequences.
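To make the two-tier structure concrete, here is a minimal sketch of how the thresholds relate. Deciding whether a model really is “trained on biological sequence data” will involve judgment calls that a simple flag can’t capture; this is only meant to show that the biological threshold sits a thousand times lower than the general one.

```python
# Sketch of the order's two reporting thresholds. The boolean flag is a
# simplification; the real determination will be far more nuanced.

GENERAL_THRESHOLD = 1e26
BIO_SEQUENCE_THRESHOLD = 1e23   # 1,000x lower for models trained on biological sequence data

def reporting_threshold(trained_on_bio_sequences: bool) -> float:
    return BIO_SEQUENCE_THRESHOLD if trained_on_bio_sequences else GENERAL_THRESHOLD

print(GENERAL_THRESHOLD / BIO_SEQUENCE_THRESHOLD)           # 1000.0
print(reporting_threshold(trained_on_bio_sequences=True))   # 1e+23
```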
I view regulation of biology labs as one of the most sensible proposals in the Biden executive order. Rather than trying to prevent the development or wide availability of powerful AI systems—which seems hopeless—it makes more sense to limit the damage people can do with AI by locking down the physical world. Regulating biology labs that synthesize nucleic acid sequences seems like a great place to start.
The order takes a similar tack to defend against AI-enabled cyberattacks. It instructs every agency with authority over critical infrastructure to consider how “deploying AI may make critical infrastructure systems more vulnerable to critical failures, physical attacks, and cyber attacks, and shall consider ways to mitigate these vulnerabilities.” Again, I like the focus on critical infrastructure here because locking down the physical world will help minimize the damage that rogue AI systems (or rogue people using AI systems) can do.
The order has much less to say about AI systems evading human control by deceiving people. It’s a hard problem. An AI system that’s smart enough to deceive people “in the wild” may also be smart enough to deceive people conducting red-team tests. So it’s not clear how much federal action can help here.
A lot of AI regulation is coming
To keep this article a manageable length, I won’t try to summarize all the other issues addressed by Biden’s order. There is language on immigration, copyright and patent law, discrimination and bias by AI systems, AI-generated misinformation, and a lot more.
The Biden Administration plans to spend money to beef up the federal government’s AI capabilities. This will include trying to recruit more AI experts into government, encouraging government agencies to use AI technology, and creating a National AI Research Resource—a large government-run computing cluster that could be made available to AI researchers in and out of government.
Many provisions of yesterday’s executive order don’t do anything directly. Instead, they instruct some other government official to draft a report or begin a rulemaking process. So over the next year, we’re going to see many government agencies publishing reports or proposing regulations in response to Biden’s order.
Personally, I would have preferred to have more of this activity initiated by Congress, which is supposed to be the branch of government that drafts legislation and decides how to spend money. But Congress has become so dysfunctional that it now struggles with basic tasks like passing annual spending bills. It’s hard to imagine Congress passing AI legislation before the start of the next session in 2025.
And the president apparently feels that this issue is too urgent to wait for Congress to get its act together. So he pushed forward on his own. It may fall to the courts to decide if he has the authority to do so.