By Émile P. Torres, October 24, 2017
In a recent article for Skeptic, Michael Shermer (the magazine’s founding publisher) put forth an argument for “why AI is not an existential threat,” where “AI” stands for “artificial intelligence” and an “existential threat” is anything that could cause human extinction or the irreversible decline of civilization.
Unfortunately, Shermer’s argument is a jumble of misconstrued ideas that have little or nothing to do with the actual problems of “AI safety” research. To be clear, there are many possible societal consequences of developing AI systems, including job losses through automation and future wars involving lethal autonomous weapons. But the most significant worry stems from an AI system with greater-than-human intelligence, or a “superintelligence.” The prospect of creating such a system has led many business leaders and scholars—including Elon Musk, Bill Gates, Stephen Hawking, and Nick Bostrom—to identify superintelligence as one of the greatest existential threats facing humanity. Given the potential significance of this topic, I would like to correct a few of Shermer’s most serious mistakes.
Don’t anthropomorphize! The first half of Shermer’s article correctly identifies some of the issues that worry AI experts. But as soon as Shermer turns to criticizing these issues, his argument quickly goes off the rails.
For example, he claims that “most AI doomsday prophecies are grounded in the false analogy between human nature and computer nature, or natural intelligence and artificial intelligence.” This could hardly be more wrong. One of the most commonly emphasized points in the AI safety literature is that one must never anthropomorphize AI—that is, don’t project human mental properties onto artificially intelligent systems.
The “cognitive architecture,” as scholars put it, of an AI might be entirely different from that of humans, which was shaped over millions of years by natural selection. Indeed, the difference between natural intelligence and artificial intelligence is one of the primary reasons that many experts are worried: A superintelligent AI with a cognitive architecture completely unlike our own could behave in ways that we are fundamentally unable to predict or understand.
The myth of an “evil” superintelligence. Second, Shermer seems to believe that a superintelligence would need consciousness and/or emotions for it to pose a threat to humanity. He writes:
To believe that an [artificial superintelligence] would be “evil” in any emotional sense is to assume a computer cognition that includes such psychological traits as acquisitiveness, competitiveness, vengeance, and bellicosity, which seem to be projections coming from the mostly male writers who concoct such dystopias, not features any programmer would bother including, assuming that it could even be done. What would it mean to program an emotion into a computer?
To buttress this claim, Shermer cites Harvard psychologist Steven Pinker’s 2015 answer to the question “What do you think about machines that think?”:
[Artificial intelligence] dystopias project a parochial alpha-male psychology onto the concept of intelligence. They assume that superhumanly intelligent robots would develop goals like deposing their masters or taking over the world. But intelligence is the ability to deploy novel means to attain a goal; the goals are extraneous to the intelligence itself. Being smart is not the same as wanting something . . . It’s telling that many of our techno-prophets don’t entertain the possibility that artificial intelligence will naturally develop along female lines: fully capable of solving problems, but with no desire to annihilate innocents or dominate the civilization.
I am immediately reminded of one AI safety researcher’s response to Pinker’s argument: “Will someone please tell [Pinker] to read… gosh, anything on the issue that isn’t a news story?” The reason for this exasperation is that no serious AI expert is worried about an “evil” superintelligence takeover, nor are experts worried about an alpha-male AI scenario. The MIT physics professor and co-founder of the Future of Life Institute, Max Tegmark, identifies this as one of the top myths about advanced AI. (Note that I have previously written for the institute.)
The problem of misaligned goals. The important issue that Shermer notes in his article, but then fails to address, concerns the possibility of a superintelligence whose goal system, or “values” for short, is misaligned with ours. There are several ideas to unpack here. First, consider a canonical analogy to explain why value alignment is such a big deal: Think about the existential threat that humans pose to ant colonies. The goal of ants is to create underground colonies whereas the goal of humans, in this example, is to create suburban neighborhoods. These goals are misaligned. Now, since intelligence yields power—the smarter an organism is, the more effectively it can modify its environment to achieve its goals, whatever they are—our superior intelligence enables us to modify the environment in more forceful ways (for example, using bulldozers) than a colony of ants ever could. The result is an ant genocide—not because we hate ants or because we’re “evil,” but simply because we are more powerful and have different values.
Now run the same example with human civilization in the role of the ant colony and a superintelligence in the role of humanity, and the real danger comes into focus. A superintelligence whose goal system is even slightly misaligned with ours could, being far more powerful than any human or human institution, bring about human extinction for the very same reason that construction workers routinely slaughter large populations of ants. How could a superintelligence be so powerful? It wouldn’t need a Terminator-like body. Rather, its fingers and tentacles would be any electronic device or process within reach, from laboratory equipment to nuclear warning systems to satellites to the global economy.
Furthermore, since computers process information far faster than human brains, a superintelligence could manipulate the world at speeds that confound any human attempt to control it. According to my own calculations, if it takes an average of more than eight years to earn a PhD, a computer-simulated human brain could achieve this in less than five minutes.
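For readers who want to check the arithmetic, here is a back-of-the-envelope sketch in Python. The speedup factor is not a measurement; it simply falls out of comparing an eight-year task to a five-minute one, and it lands near the million-fold figure often assumed for hardware that switches much faster than biological neurons.

```python
# Back-of-the-envelope arithmetic behind the "PhD in five minutes" claim.
# All numbers are illustrative assumptions, not measurements.

YEARS_FOR_PHD = 8.3                  # assumed average: just over eight years
MINUTES_PER_YEAR = 365.25 * 24 * 60  # about 525,960 minutes per year

human_minutes = YEARS_FOR_PHD * MINUTES_PER_YEAR
target_minutes = 5.0                 # "less than five minutes"

required_speedup = human_minutes / target_minutes
print(f"Human time: about {human_minutes:,.0f} minutes")
print(f"Required speedup: roughly {required_speedup:,.0f}x")
# Prints a speedup of roughly 870,000x -- on the order of a million times
# faster than a biological brain working at human speed.
```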
Values are complex. It is crucial that we load the right value system into a superintelligent AI, or else we could end up like the ant colony. But what exactly are humanity’s values? This introduces a profoundly complicated issue. Indeed, philosophers have been trying to figure out the best systems of values—moral, legal, cultural, and so on—for more than 2,500 years, and still there is hardly any agreement among professionals!
What’s more, our values turn out to be far more complex than they naively appear to us. For example, you wouldn’t want to give a self-driving car the simple goal of “drive me to the airport as quickly as possible,” because you might arrive covered in vomit, and possibly in legal trouble if anyone was run over along the way. Instead, you would also have to load values like “don’t run over anyone,” “don’t drive so fast that I become sick,” “don’t take shortcuts through farm fields,” and so on.
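To make the point concrete, here is a minimal sketch of how an underspecified goal goes wrong. The objective functions, penalty weights, and candidate plans below are all hypothetical, invented purely for illustration; the lesson is that any value left out of the objective is a value the optimizer is free to trample.

```python
# Toy illustration (hypothetical objectives and plans, not a real planner):
# a naive goal rewards only speed, while an amended goal also penalizes
# the side effects we implicitly care about.

def naive_objective(plan):
    # Rewards nothing but minimizing travel time.
    return -plan["travel_time"]

def amended_objective(plan):
    # Still rewards speed, but heavily penalizes the unstated values.
    return (
        -plan["travel_time"]
        - 1e9 * plan["pedestrians_hit"]       # "don't run over anyone"
        - 1e3 * plan["passenger_discomfort"]  # "don't drive so fast that I become sick"
        - 1e4 * plan["fields_crossed"]        # "don't take shortcuts through farm fields"
    )

reckless = {"travel_time": 20, "pedestrians_hit": 1,
            "passenger_discomfort": 9, "fields_crossed": 2}
careful = {"travel_time": 35, "pedestrians_hit": 0,
           "passenger_discomfort": 1, "fields_crossed": 0}

# The naive objective prefers the reckless plan; the amended one does not.
print(naive_objective(reckless) > naive_objective(careful))      # True
print(amended_objective(reckless) > amended_objective(careful))  # False
```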
The more experts think about this issue, the more they realize how challenging it will be to cover every conceivable situation so that nothing bad happens. And whereas the worst-case outcome in the self-driving car example is perhaps a handful of deaths, the stakes rise along with the power of AI systems. Once those systems become superintelligent, a failure to load the right values could be catastrophic on a global scale.
But the situation is even more problematic. As numerous AI safety experts have pointed out, it may not be enough to align the value system of a superintelligence 90 percent of the way, just as dialing 9 out of 10 digits of a phone number won’t connect you to someone who is 90 percent similar to the person you’re trying to call. In other words, our values are “fragile,” meaning that we may have to entirely, rather than partially, solve the age-old philosophical question of what our values should be by the time AI reaches a human level of general intelligence. Let’s get on it, philosophers!
Feelings are irrelevant. When Shermer writes that “the fear that computers will become emotionally evil are [sic] unfounded,” he sidesteps everything substantive about the actual debate among AI safety experts. He also misfires in claiming that, without emotions, a superintelligence would not be able to do anything. Quoting the science writer Michael Chorost, he argues that “until an AI has feelings, it’s going to be unable to want to do anything at all, let alone act counter to humanity’s interests and fight off human resistance.”
This is perhaps the most egregious error in the article. An AI system no more needs “feelings” to cause harm than a bulldozer needs feelings to wreak havoc in a city after someone wedges the throttle forward with a shoe. The appeal to emotions is a complete red herring.
To use a computer-based analogy, does my word processor need emotions to correct misspelled words? Does my phone’s predictive text function need emotions to accurately anticipate what I’m typing? Does, as the existential risk scholar Olle Häggström asks, a heat-seeking missile need emotions to strike its target? The answer is “no.” The same goes for a superintelligence that has the simple goal of, for example, manufacturing as many paperclips as possible: It will simply do what it’s told—and if humanity happens to get in the way, then too bad for us.
The issue of speed. Shermer also misunderstands the dangers associated with AI systems that are capable of improving themselves, arguing that there would be “time enough to see [a superintelligence] coming” and that “incremental progress is what we see in most technologies, including and especially AI.”
The idea of incremental progress in “most technologies” is irrelevant, because AI isn’t like most technologies: Whereas cars themselves can’t create better cars, AIs can create better AIs. Consider that we already have software that can design software even better than humans can. Now imagine that we create a computer program whose primary goal is to design increasingly more intelligent computer programs. The result could be a positive feedback loop that AI experts refer to as “recursive self-improvement.” Since positive feedback loops can produce exponential results, a recursively self-improving AI could bring about an intelligence explosion, or an exponential rise in the intelligence of the AI with each iteration of self-improvement.
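Here is a minimal sketch of that feedback loop, with made-up parameters chosen purely to illustrate why proportional self-improvement yields exponential rather than incremental growth:

```python
# Toy model of recursive self-improvement (arbitrary units and rates):
# each iteration improves capability in proportion to current capability,
# which is exactly what produces exponential growth.

capability = 1.0        # starting "intelligence" in arbitrary units
improvement_rate = 0.5  # assumed fractional gain per iteration

for iteration in range(20):
    capability += improvement_rate * capability  # smarter systems improve themselves faster

print(f"After 20 iterations: roughly {capability:,.0f}x the starting capability")
# Prints roughly 3,325x. A merely incremental (linear) process adding the
# same initial step each round would reach only 11x after 20 iterations.
```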
How probable is this? No one really knows. But surveys of AI experts suggest that we will almost certainly have human-level AI by the end of this century. And if scholars such as the philosopher David Chalmers are correct, we will likely see superintelligence within the same general time frame.
A false dichotomy. Finally, Shermer distinguishes between the “AI Utopians” and “AI Dystopians.” But this seriously misrepresents the growing movement to take what Tegmark calls AI “safety engineering” seriously. According to Shermer, Ray Kurzweil is in the utopian camp, yet Kurzweil has also expressed serious concerns about the existential risk of superintelligence. And Shermer seems to place safety experts Nick Bostrom, Stuart Russell, and Eliezer Yudkowsky in the dystopian camp, yet one does not have to look far to find descriptions from these people about how a superintelligence whose value system is aligned with ours could create a better world than anything ever imagined. To paraphrase Stephen Hawking—who is very worried about AI—if superintelligence isn’t the worst thing to ever happen to humanity, it will very likely be the best.
The dichotomy between utopians and dystopians misstates the reality of AI safety research. Indeed, I myself believe that, as Bostrom suggests, the “default outcome” of superintelligence will be “doom,” yet if we manage to solve the problem of how to control an algorithm that is smarter than any human, the future could be brighter than ever before.
In reading Shermer’s article, one gets the impression that he has not seriously engaged with the literature. This is too bad, because it behooves public intellectuals to understand the topics that they write about, especially when confidently asserting that “artificial intelligence is not an existential threat” (as the title of Shermer’s article does). At this point, no one can responsibly make such a strong statement, and doing so is not only foolish but also risks leaving humanity unnecessarily vulnerable to an otherwise avoidable existential catastrophe.
Superintelligence could very well turn out to be an over-hyped risk to our collective future. But it also might not. Given what’s at stake, we must proceed with thoughtful caution.