Policy makers should plan for superintelligent AI, even if it never happens

By Zachary Kallenborn | December 21, 2023

Robot playing chess. Credit: Vchalup via Adobe

Experts from around the world are sounding alarm bells to signal the risks artificial intelligence poses to humanity. Earlier this year, hundreds of tech leaders and AI specialists signed a one-sentence letter released by the Center for AI Safety that read “mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” In a 2022 survey, half of researchers indicated they believed there’s at least a 10 percent chance human-level AI causes human extinction. In June, at the Yale CEO summit, 42 percent of surveyed CEOs indicated they believe AI could destroy humanity in the next five to 10 years.

These concerns mainly pertain to artificial general intelligence (AGI), systems that can rival human cognitive skills and artificial superintelligence (ASI), machines with capacity to exceed human intelligence. Currently no such systems exist. However, policymakers should take these warnings, including the potential for existential harm, seriously.

Because the timeline, and form, of artificial superintelligence is uncertain, the focus should be on identifying and understanding potential threats and building the systems and infrastructure necessary to monitor, analyze, and govern those risks, both individually and as part of a holistic approach to AI safety and security. Even if artificial superintelligence does not manifest for decades or even centuries, or at all, the magnitude and breadth of potential harm warrants serious policy attention. For if such a system does indeed come to fruition, a head start of hundreds of years might not be enough.

Prioritizing artificial superintelligence risks, however, does not mean ignoring immediate risks like biases in AI, propagation of mass disinformation, and job loss. An artificial superintelligence unaligned with human values and goals would super charge those risks, too. One can easily imagine how Islamophobia, antisemitism, and run-of-the-mill racism and bias—often baked into AI training data—could affect the system’s calculations on important military or diplomatic advice or action. If not properly controlled, an unaligned artificial superintelligence could directly or indirectly cause genocide, massive job loss by rendering human activity worthless, creation of novel biological weapons, and even human extinction.

The threat. Traditional existential threats like nuclear or biological warfare can directly harm humanity, but artificial superintelligence could create catastrophic harm in myriad ways. Take for instance an artificial superintelligence designed to protect the environment and preserve biodiversity. The goal is arguably a noble one: A 2018 World Wildlife Foundation report concluded humanity wiped out 60 percent of global animal life just since 1970, while a 2019 report by the United Nations Environment Programme showed a million animal and plant species could die out in decades. An artificial superintelligence could plausibly conclude that drastic reductions in the number of humans on Earth—perhaps even to zero—is, logically, the best response. Without proper controls, such a superintelligence might have the ability to cause those “logical” reductions.

A superintelligence with access to the Internet and all published human material would potentially tap into almost every human thought—including the worst of thought. Exposed to the works of the Unabomber, Ted Kaczynski, it might conclude the industrial system is a form of modern slavery, robbing individuals of important freedoms. It could conceivably be influenced by Sayyid Qutb, who provided the philosophical basis for al-Qaeda, or perhaps by Adolf Hitler’s Mein Kampf, now in the public domain.

The good news is an artificial intelligence—even a superintelligence—could not manipulate the world on its own. But it might create harm through its ability to influence the world in indirect ways. It might persuade humans to work on its behalf, perhaps using blackmail. Or it could provide bad recommendations, relying on humans to implement advice without recognizing long-term harms. Alternatively, artificial superintelligence could be connected to physical systems it can control, like laboratory equipment. Access to the Internet and the ability to create hostile code could allow a superintelligence to carry out cyber-attacks against physical systems. Or perhaps a terrorist or other nefarious actor might purposely design a hostile superintelligence and carry out its instructions.

Introduction: The Hype, Peril, and Promise of Artificial Intelligence

That said, a superintelligence might not be hostile immediately. In fact, it may save humanity before destroying it. Humans face many other existential threats, such as near-Earth objects, super volcanos, and nuclear war. Insights from AI might be critical to solve some of those challenges or identify novel scenarios that humans aren’t aware of. Perhaps an AI might discover novel treatments to challenging diseases. But since no one really knows how a superintelligence will function, it’s not clear what capabilities it needs to generate such benefits.

The immediate emergence of a superintelligence should not be assumed. AI researchers differ drastically on the timeline of artificial general intelligence, much less artificial superintelligence. (Some doubt the possibility altogether.) In a 2022 survey of 738 experts who published during the previous year on the subject, researchers estimated a 50 percent chance of “high-level machine intelligence” by 2059. In an earlier, 2009 survey, the plurality of respondents believed an AI capable of Nobel Prize winner-level intelligence would be achieved by the 2020s, while the next most common response was Nobel-level intelligence would not come until after the 2100 or never.

As philosopher Nick Bostrom notes, takeoff could occur anywhere from a few days to a few centuries. The jump from human to super-human intelligence may require additional fundamental breakthroughs in artificial intelligence. But a human-level AI might recursively develop and improve its own capabilities, quickly jumping to super-human intelligence.

There is also a healthy dose of skepticism regarding whether artificial superintelligence could emerge at all in the near future, as neuroscientists acknowledge knowing very little about the human brain itself, let alone how to recreate or better it. However, even a small chance of such a system emerging is enough to take it seriously.

Policy response. The central challenge for policymakers in reducing artificial superintelligence-related risk is grappling with the fundamental uncertainty about when and how these systems may emerge balanced against the broad economic, social, and technological benefits that AI can bring. The uncertainty means that safety and security standards must adapt and evolve. The approaches to securing the large language models of today may be largely irrelevant to securing some future superintelligence-capable model. However, building policy, governance, normative, and other systems necessary to assess AI risk and to manage and reduce the risks when superintelligence emerges can be useful—regardless of when and how it emerges. Specifically, global policymakers should attempt to:

Characterize the threat. Because it lacks a body, artificial superintelligence’s harms to humanity are likely to manifest indirectly through known existential risk scenarios or by discovering novel existential risk scenarios. How such a system interacts with those scenarios needs to be better characterized, along with tailored risk mitigation measures. For example, a novel biological organism that is identified by an artificial superintelligence should undergo extensive analysis by diverse, independent actors to identify potential adverse effects. Likewise, researchers, analysts, and policymakers need to identify and protect, to the extent that’s possible, critical physical facilities and assets—such as biological laboratory equipment, nuclear command and control infrastructure, and planetary defense systems—through which an uncontrolled AI could create the most harm.

Monitor. The United States and other countries should conduct regular comprehensive surveys and assessment of progress, identify specific known barriers to superintelligence and advances towards resolving them, and assess beliefs regarding how particular AI-related developments may affect artificial superintelligence-related development and risk. Policymakers could also establish a mandatory reporting system if an entity hits various AI-related benchmarks up to and including artificial superintelligence.

A monitoring system with pre-established benchmarks would allow governments to develop and implement action plans for when those benchmarks are hit. Benchmarks could include either general progress or progress related to “specifically dangerous capabilities,” such as the capacity to enable a non-expert to design, develop, and deploy novel biological or chemical weapons, or developing and using novel offensive cyber capabilities. For example, the United States might establish safety laboratories with the responsibility to critically evaluate a claimed artificial general intelligence against various risk benchmarks, producing an independent report to Congress, federal agencies, or other oversight bodies. The United Kingdom’s new AI Safety Institute could be a useful model.

AI and atoms: How artificial intelligence is revolutionizing nuclear material

Debate. A growing community concerned about artificial superintelligence risks are increasingly calling for decelerating, or even pausing, AI development to better manage the risks. In response, the accelerationist community is advocating speeding up research, highlighting the economic, social, and technological benefits AI may unleash, while downplaying risks as an extreme hypothetical. This debate needs to expand beyond techies on social media to global legislatures, governments, and societies. Ideally, that discussion should center around what factors would cause a specific AI system to be more, or less, risky. If an AI possess minimal risk, then accelerating research, development, and implementation is great. But if numerous factors point to serious safety and security risks, then extreme care, even deceleration, may be justified.

Build global collaboration. Although ad hoc summits like the recent AI Safety Summit is a great start, a standing intergovernmental and international forum would enable longer-term progress, as research, funding, and collaboration builds over time. Convening and maintaining regular expert forums to develop and assess safety and security standards, as well as how AI risks are evolving over time, could provide a foundation for collaboration. The forum could, for example, aim to develop standards akin to those applied to biosafety laboratories with scaling physical security, cyber security, and safety standards based on objective risk measures. In addition, the forum could share best practices and lessons learned on national-level regulatory mechanisms, monitor and assess safety and security implementation, and create and manage a funding pool to support these efforts. Over the long-term, once the global community coalesces around common safety and security standards and regulatory mechanisms, the United Nations Security Council (UNSC) could obligate UN member states to develop and enforce those mechanisms, as the Security Council did with UNSC Resolution 1540 mandating various chemical, biological, radiological, and nuclear weapons nonproliferation measures. Finally, the global community should incorporate artificial superintelligence risk reduction as one aspect in a comprehensive all-hazards approach, addressing common challenges with other catastrophic and existential risks. For example, the global community might create a council on human survival aimed at policy coordination, comparative risk assessment, and building funding pools for targeted risk reduction measures.

Establish research, development, and regulation norms within the global community. As nuclear, chemical, biological, and other weapons have proliferated, the potential for artificial superintelligence to proliferate to other countries should be taken seriously. Even if one country successfully contains such a system and harnesses the opportunities for social good, others may not. Given the potential risks, violating AI-related norms and developing unaligned superintelligence should justify violence and war. The United States and the global community have historically been willing to support extreme measures to enforce behavior and norms concerning less risky developments. In August 2013, former President Obama (in)famously drew a red line on Syria’s use of chemical weapons, noting the Assad regime’s use would lead him to use military force in Syria. Although Obama later demurred, favoring a diplomatic solution, in 2018 former President Trump later carried out airstrikes in response to additional chemical weapons usage. Likewise, in Operation Orchard in 2007, the Israeli Air Force attacked the Syrian Deir ez-Zor site, a suspected nuclear facility aimed at building a nuclear weapons program.

Advanced artificial intelligence poses significant risks to the long-term health and survival of humanity. However, it’s unclear when, how, or where those risks will manifest. The Trinity Test of the world’s first nuclear bomb took place almost 80 years ago, and humanity has yet to contain the existential risk of nuclear weapons. It would be wise to think of the current progress in AI as our Trinity Test moment. Even if superintelligence takes a century to emerge, 100 years to consider the risks and prepare might still not be enough.

Thanks to Mark Gubrud for providing thoughtful comments on the article.

Together, we make the world safer.

The Bulletin elevates expert voices above the noise. But as an independent nonprofit organization, our operations depend on the support of readers like you. Help us continue to deliver quality journalism that holds leaders accountable. Your support of our work at any level is important. In return, we promise our coverage will be understandable, influential, vigilant, solution-oriented, and fair-minded. Together we can make a difference.

Get alerts about this thread
Notify of
1 Comment
Newest Most Voted
Inline Feedbacks
View all comments
Alfonso R. Reyes
Alfonso R. Reyes
2 months ago

Such an apocalyptic article with no references to back it up. The fact that 42 percent of CEOs at a Yale summit are of the opinion that humanity could be destroyed by AI within the next 10 years, shows how uninformed the business elites are. The anxiety on a superhuman intelligence has been fed the past 3 years by marketing and sales departments to sell old applications with a new label. Everything labeled “smart” or “intelligent” will sell more than standard items, while at the same time “transmitting” and aura of intelligence to the buyer. I recommend people go and… Read more »