By Zachary Kallenborn, February 1, 2022
If artificial intelligences controlled nuclear weapons, all of us could be dead.
That is no exaggeration. In 1983, Soviet Air Defense Forces Lieutenant Colonel Stanislav Petrov was monitoring nuclear early warning systems when the computer concluded, with the highest confidence, that the United States had launched a nuclear war. But Petrov was doubtful: The computer estimated only a handful of nuclear weapons were incoming, when a genuine surprise attack would more plausibly entail an overwhelming first strike. He also didn’t trust the new launch detection system, and the radar system offered no corroborating evidence. Petrov decided the message was a false positive and did nothing. The computer was wrong; Petrov was right. The false signals came from the early warning system mistaking the sun’s reflection off the clouds for missiles. But if Petrov had been a machine, programmed to respond automatically when confidence was sufficiently high, that error would have started a nuclear war.
Militaries are increasingly incorporating autonomous functions into weapons systems, though, as far as is publicly known, they haven’t yet turned the nuclear launch codes over to an AI system. Russia has developed a nuclear-armed, nuclear-powered torpedo that is autonomous in ways that have not been publicly disclosed, and defense thinkers in the United States have proposed automating the launch decision for nuclear weapons.
There is no guarantee that some military won’t put AI in charge of nuclear launches; international law doesn’t specify that there should always be a “Petrov” guarding the button. That’s something that should change, soon.
How autonomous nuclear weapons could go wrong. The huge problem with autonomous nuclear weapons, and really all autonomous weapons, is error. Machine learning-based artificial intelligences—the current AI vogue—rely on large amounts of data to perform a task. Google’s AlphaGo program beat the world’s greatest human Go players, experts at the ancient Chinese game that’s even more complex than chess, by playing millions of games against itself to learn the game. For a constrained game like Go, that worked well. But in the real world, data may be biased or incomplete in all sorts of ways. For example, one hiring algorithm concluded that being named Jared and playing high school lacrosse was the most reliable indicator of job performance, probably because it picked up on human biases in the data.
In a nuclear weapons context, a government may have little data about adversary military platforms; existing data may be structurally biased, by, for example, relying on satellite imagery; or data may not account for obvious, expected variations, such as imagery taken during foggy, rainy, or overcast weather.
The nature of nuclear conflict compounds the problem of error.
How would a nuclear weapons AI even be trained? Nuclear weapons have been used only twice, in Hiroshima and Nagasaki, and serious nuclear crises are (thankfully) infrequent. Perhaps inferences can be drawn from adversary nuclear doctrine, plans, acquisition patterns, and operational activity, but the lack of actual examples of nuclear conflict means judging the quality of those inferences is impossible. While a lack of examples hinders humans too, humans have the capacity for higher-order reasoning. Humans can create theories and identify generalities from limited information or information that is analogous, but not equivalent. Machines cannot.
The deeper challenge is the high false positive rate inherent in predicting rare events. There have thankfully been only two nuclear attacks in history. An autonomous system designed to detect and retaliate against an incoming nuclear weapon, even if highly accurate, will frequently exhibit false positives. Around the world, in North Korea, Iran, and elsewhere, test missiles are fired into the sea and rockets are launched into the atmosphere. And there have been many false alarms of nuclear attacks, vastly more than actual attacks. An AI that’s right almost all the time still has plenty of opportunities to get it wrong. Similarly, with a test that accurately diagnoses cases of a rare disease 99 percent of the time, a positive result may mean just a 5 percent likelihood of actually having the disease, depending on assumptions about the disease’s prevalence and false positive rates. This is because when an event is rare, the number of false positives can vastly outweigh the number of true positives. So, if an autonomous nuclear weapon concluded with 99 percent confidence that a nuclear war is about to begin, should it fire?
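To make the disease analogy concrete, here is a minimal sketch of the underlying Bayes’ theorem arithmetic. The prevalence and error rates below are illustrative assumptions, chosen only to show how false positives swamp true positives when the event being detected is rare:

```python
# Illustrative base-rate arithmetic (Bayes' theorem). The prevalence and error
# rates are assumptions for the sake of the example, not real estimates.

def posterior_probability(prior, true_positive_rate, false_positive_rate):
    """Probability the event is real, given that the detector says it is."""
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive

# A test that catches 99 percent of true cases of a disease afflicting
# 1 in 2,000 people, with a 1 percent false positive rate:
print(posterior_probability(prior=1 / 2000,
                            true_positive_rate=0.99,
                            false_positive_rate=0.01))
# -> roughly 0.047: a positive result means only about a 5 percent chance of
#    actually having the disease, because false alarms from the 1,999 healthy
#    people swamp the single true case. The same arithmetic would confront an
#    early warning system watching for an event as rare as a real nuclear attack.
```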
In the extremely unlikely event those problems can all be solved, autonomous nuclear weapons introduce new risks of error and opportunities for bad actors to manipulate systems. Current AI is not only brittle; it’s easy to fool. A single pixel change is enough to convince an AI a stealth bomber is a dog. This creates two problems. If a country actually sought a nuclear war, it could fool the AI system first, rendering it useless. Or a well-resourced, apocalyptic terrorist organization like the Japanese cult Aum Shinrikyo might attempt to trick an adversary’s system into starting a catalytic nuclear war. Both approaches can be carried out in quite subtle, difficult-to-detect ways: data poisoning may manipulate the training data that feeds the AI system, or unmanned systems or emitters could be used to trick an AI into believing a nuclear strike is incoming.
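To give a sense of how fragile current machine learning models can be, here is a minimal, self-contained sketch of a gradient-style adversarial perturbation against a toy linear classifier. The model, the “image,” and the labels are invented purely for illustration and have no relation to any real system:

```python
import numpy as np

# Toy linear "image classifier": positive score -> "bomber", negative -> "dog".
# The weights and the input below are random stand-ins, invented for illustration.
rng = np.random.default_rng(seed=42)
weights = rng.normal(size=100)   # a fixed, already-"trained" linear model
image = rng.normal(size=100)     # a toy 10x10 "image", flattened

score = float(weights @ image)
print("original prediction:", "bomber" if score > 0 else "dog", f"(score {score:.2f})")

# Fast-gradient-sign-style attack: nudge every pixel slightly in the direction
# that pushes the score toward the opposite class. For a linear model, the
# gradient of the score with respect to the input is just the weight vector,
# so the smallest uniform nudge that flips the label can be computed exactly.
epsilon = abs(score) / np.abs(weights).sum() * 1.01
adversarial = image - np.sign(score) * epsilon * np.sign(weights)

new_score = float(weights @ adversarial)
print("after a per-pixel change of at most", round(float(epsilon), 4), ":",
      "bomber" if new_score > 0 else "dog", f"(score {new_score:.2f})")
```

For deep networks the same gradient-based logic applies, which is why perturbations far too small for a human to notice can flip a classification.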
The risk of error can confound well-laid nuclear strategies and plans. If a military had to start a nuclear war, targeting the enemy’s nuclear forces with overwhelming force would be the way to limit retaliation. However, if an AI launched a nuclear weapon in error, the opening salvo may instead be a pittance: a single nuclear weapon aimed at a less-than-ideal target. Accidentally nuking a major city might provoke an overwhelming nuclear retaliation, because the adversary would still have all its missile silos, just not that city.
Some have nonetheless argued that autonomous weapons (not necessarily autonomous nuclear weapons) will eventually reduce the risk of error. Machines do not need to protect themselves and can be more conservative in making decisions to use force. They do not have emotions that cloud their judgment and do not exhibit confirmation bias—a type of bias in which people interpret data in a way that conforms to their desires or beliefs.
While these arguments have potential merit in conventional warfare, depending on how technology evolves, they do not hold in nuclear warfare. Because nuclear weapons platforms serve as strategic deterrents, countries have strong incentives to protect them; those weapons literally safeguard the state’s existence. Instead of being risk avoidant, countries have an incentive to launch while under attack, because otherwise they may lose their nuclear weapons. Some emotion should also be part of nuclear decision-making: the prospect of catastrophic nuclear war should be terrifying, and the decision made extremely cautiously.
Finally, while autonomous nuclear weapons may not exhibit confirmation bias, the lack of training data and real-world test environments means an autonomous nuclear weapon may harbor numerous other biases, which may never be discovered until after a nuclear war has started.
The decision to unleash nuclear force is the single most significant decision a leader can make. It commits a state to an existential conflict with millions—if not billions—of lives in the balance. Such a consequential, deeply human decision should never be made by a computer.
Activists against autonomous weapons have been hesitant to focus on autonomous nuclear weapons. For example, the International Committee of the Red Cross makes no mention of autonomous nuclear weapons in its position statement on autonomous weapons. (In fairness, the International Committee for Robot Arms Control’s 2009 statement references autonomous nuclear weapons, though it represents more of the intellectual wing of the so-called “stop killer robots” movement.) Perhaps activists see nuclear weapons as already broadly banned or do not wish to legitimize nuclear weapons generally, but the lack of attention is a mistake. Nuclear weapons already have broad established norms against their use and proliferation, with numerous treaties supporting them. Banning autonomous nuclear weapons should be an easy win to establish norms against autonomous weapons. Plus, autonomous nuclear weapons represent perhaps the highest-risk manifestation of autonomous weapons (an artificial superintelligence is the only potential higher risk). Which is worse: an autonomous gun turret accidentally killing a civilian, or an autonomous nuclear weapon igniting a nuclear war that leads to catastrophic destruction and possibly the extinction of all humanity? Hint: catastrophic destruction is vastly worse.
Where autonomous nuclear weapons stand. Some autonomy in nuclear weapons is already here, but the picture is complicated, and it is unclear how worried we should be.
Russia’s Poseidon is an “Intercontinental Nuclear-Powered Nuclear-Armed Autonomous Torpedo” according to US Navy documents, while the Congressional Research Service has also described it as an “autonomous undersea vehicle.” The weapon is intended to be a second-strike weapon used in the event of a nuclear conflict. That is, it is meant to ensure a state can always retaliate against a nuclear strike, even an unexpected, so-called “bolt from the blue.” An unanswered question is: what can the Poseidon do autonomously? Perhaps the torpedo just has some autonomous maneuvering ability to better reach its target—basically, an underwater cruise missile. That’s probably not a big deal, though there may be some risk of error in misdirecting the attack.
It is more worrisome if the torpedo is given permission to attack autonomously under specific conditions. For example, what if, in a crisis scenario where Russian leadership fears a possible nuclear attack, Poseidon torpedoes are launched in a loiter mode? It could be that if the Poseidon loses communications with its host submarine, it launches an attack. Most worrisome would be a torpedo able to decide to attack entirely on its own, though this possibility is quite unlikely: it would require the Poseidon to have an independent means of assessing whether a nuclear attack had taken place while sitting far beneath the ocean. Of course, given how little is known about the Poseidon, this is all speculation. But that’s part of the point: understanding how another country’s autonomous systems operate is really hard.
Countries are also interested in so-called “dead hand” systems, which are meant to provide a backup in case a state’s nuclear command authority is disrupted or its leaders are killed. A relatively simple system like Russia’s Perimeter might delegate launch authority to a lower-level commander in a crisis, under specific conditions such as a loss of communication with command authorities. But deterrence experts Adam Lowther and Curtis McGuffin have gone further, arguing in a 2019 article in War on the Rocks that the United States should consider “an automated strategic response system based on artificial intelligence.”
The authors reason that the decision-making time to launch nuclear weapons has become so constrained that an artificial intelligence-based “dead hand” should be considered, despite the numerous errors and problems they acknowledge such a system would create. Lt. Gen. Jack Shanahan, former leader of the Department of Defense’s Joint Artificial Intelligence Center, shot the proposal down immediately: “You will find no stronger proponent of integration of AI capabilities writ large into the Department of Defense, but there is one area where I pause, and it has to do with nuclear command and control.” But Shanahan retired in 2020, and there is no reason to believe the proposal will not come up again. Perhaps next time, no one will shoot it down.
What needs to happen. As allowed under Article VIII of the Nuclear Non-Proliferation Treaty, a member state should propose an amendment to the treaty requiring all nuclear weapons states to always include humans within decision-making chains on the use of nuclear weapons. This could require diplomacy and might take a while. In the near term, countries should raise the issue when the member states next meet to review the treaty in August 2022 and establish a side-event focused on autonomous nuclear weapons issues during the 2025 conference. Even if a consensus cannot be established at the 2022 conference, countries can begin the process of working through any barriers in support of a future amendment. Countries can also build consensus outside the review conference process: Bans on autonomous nuclear weapons could be discussed as part of broader multilateral discussions on a new autonomous weapons ban.
The United States should be a leader in this effort. The congressionally appointed National Security Commission on AI recommended that humans maintain control over nuclear weapons. Page 12 notes, “The United States should (1) clearly and publicly affirm existing US policy that only human beings can authorize employment of nuclear weapons and seek similar commitments from Russia and China.” Formalizing this requirement in international law would make it far more robust.
Unfortunately, requiring humans to make decisions on firing nuclear weapons is not the end of the story. An obvious challenge is how to ensure the commitments to human control are trustworthy. After all, it is quite tough to tell whether a weapon is truly autonomous. But there might be options to at least reassure: Countries could pass laws requiring humans to approve decisions on the use of nuclear weapons; provide minimum transparency into nuclear command and control processes to demonstrate meaningful human control; or issue blanket bans on any research and development aimed at making nuclear weapons autonomous.
Now, none of this is to suggest that every fusion of artificial intelligence and nuclear weapons is terrifying. Or, more precisely, any more terrifying than nuclear weapons on their own. Artificial intelligence also has applications in situational awareness, intelligence collection, information processing, and improving weapons accuracy. Artificial intelligence may aid decision support and communication reliability, which may help nuclear stability. In fact, artificial intelligence has already been incorporated into various aspects of nuclear command, control, and communication systems, such as early warning systems. But that should never extend to complete machine control over the decision to use nuclear weapons.
The challenge of autonomous nuclear weapons is a serious one that has gotten little attention. Making changes to the Nuclear Non-Proliferation Treaty to require nuclear weapons states to maintain human control over nuclear weapons is just the start. At the very least, if a nuclear war breaks out, we’ll know who to blame.
The author would like to thank Philipp C. Bleek, James Johnson, and Josh Pollack for providing invaluable input on this article.