Every year since 2014, representatives from states around the globe have assembled in Geneva, Switzerland under an international disarmament treaty known as the Convention on Certain Conventional Weapons to try to crack an exceptionally tricky question. What, if anything, should be done about artificially intelligent (AI) weapons that can pick out and attack targets without a human pulling the trigger? Even as militaries make growing strides in developing autonomous systems, including weapons, the talks at the United Nations are heading into their eighth year with no shared answer in sight.
Negotiations over so-called lethal autonomous weapon systems have been beset by seemingly unbridgeable rifts between those who wish to ban such contrivances, those who want to regulate them, and those who dismiss the need for any new hard rules. Each of these groups, however, can probably agree on at least one point: No autonomous weapon should be a black box. These weapons should always do what they are expected to do, and they must do so for intelligible reasons.
Regardless of what rules these weapons are subjected to now or in the future, a high degree of understandability and predictability (to borrow the more technical argot of accountable AI) will be essential in any autonomous weapon. Even if the states negotiating at the United Nations opt not to set any new rules, adherence to the existing laws of war will still likely hinge on militaries having a good handle on what their autonomous weapons will do and why they do it.
To be sure, all weapons must be predictable and understandable. But autonomous weapons will likely need to be especially predictable to compensate for the absence of human control, and they need to be especially understandable because that’s often the only way to know exactly how they’ll behave.
In one experiment to illustrate why understandability, in particular, is so important for AI, researchers presented a computer vision tool that appeared to be excellent at distinguishing wolves from huskies. As it happened, the system actually worked by simply detecting snow in the images, since most of the wolf pictures the machine had trained on were taken in a snowy wilderness. The team showed that only by studying how the algorithm recognized wolves could the ruse be revealed.
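The failure mode can be reproduced in miniature. The Python sketch below is a toy illustration, not the researchers' actual system: it assumes each image has already been reduced to two made-up binary features (`snow_background` and `large_body`), and it uses a deliberately simple learner that keeps whichever single feature best fits the training labels. All names and data here are hypothetical.

```python
# Toy training set (hypothetical): every wolf photo has snow, no husky photo does.
# "large_body" is a more meaningful cue, but it is imperfect in this sample.
TRAIN = [
    ({"snow_background": 1, "large_body": 1}, "wolf"),
    ({"snow_background": 1, "large_body": 1}, "wolf"),
    ({"snow_background": 1, "large_body": 1}, "wolf"),
    ({"snow_background": 1, "large_body": 0}, "wolf"),   # a small wolf
    ({"snow_background": 0, "large_body": 0}, "husky"),
    ({"snow_background": 0, "large_body": 0}, "husky"),
    ({"snow_background": 0, "large_body": 0}, "husky"),
    ({"snow_background": 0, "large_body": 1}, "husky"),  # a big husky
]


def fit_single_feature_rule(examples):
    """Return the one feature whose presence best predicts 'wolf' on the
    training data, together with its training accuracy."""
    features = sorted(examples[0][0])

    def accuracy(feature):
        hits = sum((x[feature] == 1) == (label == "wolf") for x, label in examples)
        return hits / len(examples)

    best = max(features, key=accuracy)
    return best, accuracy(best)


def predict(feature, x):
    """Apply the learned rule: feature present means 'wolf', otherwise 'husky'."""
    return "wolf" if x[feature] == 1 else "husky"


chosen_feature, train_accuracy = fit_single_feature_rule(TRAIN)

# Judged by accuracy alone, the model looks flawless on data like its training set.
print("training accuracy:", train_accuracy)

# Inspecting the rule itself is what exposes the shortcut.
print("feature the model relies on:", chosen_feature)

# Images that break the snow/wolf coincidence are misclassified.
husky_in_snow = {"snow_background": 1, "large_body": 0}
wolf_on_grass = {"snow_background": 0, "large_body": 1}
print("husky in snow ->", predict(chosen_feature, husky_in_snow))
print("wolf on grass ->", predict(chosen_feature, wolf_on_grass))
```

In this sketch the accuracy figure gives no hint of a problem; only reading off which feature the rule depends on, or testing it on images outside the training conditions, shows that it is detecting snow rather than wolves.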
The recent history of AI is full of similarly dramatic examples of machinic sleight of hand, each of which is a reason to believe that any sufficiently complex autonomous weapon could appear to behave exactly as intended even when something has gone very wrong. Unless you look under the hood, it would be hard to tell whether it is actually doing what you want it to do, whether it is doing so the way you want it to, and whether it will keep doing so in future uses.
In testing, an autonomous weapon therefore needs to be sufficiently understandable for evaluators to catch any invisible bugs or deficiencies long before deployment and predictable enough for testers to validate whether it would behave in the real world exactly as it had in the lab. Being able to peer under a weapon’s algorithmic hood and precisely anticipate future performance would also be vital for the legal reviews of new warfare technologies that all states must conduct, as mandated by Additional Protocol I to the Geneva Conventions. Those reviews determine whether the weapons could be used in compliance with the relevant national and international laws, for example the law requiring militaries to distinguish between civilians and combatants when carrying out an attack.
The need for predictability and understandability doesn’t stop once an autonomous weapon is cleared for use. In active conflicts, commanders who have to decide whether or not to use an autonomous weapon on the battlefield need to be able to anticipate the effects of doing so. Autonomous weapons operators with final launch authority—and just as importantly, abort authority—need to know when to trust these machines to execute their objectives and when not to trust them.
These human decisions will have to draw on a prediction of how the weapon is likely to respond to the conditions at hand, including the adversaries and other complex systems present in the conflict. Such a prediction cannot rest solely on an assurance from the lab that the system works, or even on a stellar track record. Like other AI systems, autonomous weapons that perform exactly as intended in all previous instances can still fail spectacularly when they encounter certain deviant conditions. If such a system is mysterious to the engineers who created it, let alone the soldiers who would have to make life-or-death decisions over its use, there’s probably no way to forestall such failures before it’s too late.
Such requirements will also likely extend long after the dust from any operation has settled. When an autonomous weapon causes unintentional harm, those responsible need to understand why it failed or malfunctioned, anticipate whether it is likely to do so again, and figure out how to avoid similar future mishaps with technical tweaks or new limits on how it can be used.
Predictability and understandability would even play a central role in enabling the total ban on fully autonomous weapons that some states and NGOs have proposed. If the parties negotiating at the United Nations agreed to such a ban, the red line between forbidden autonomous weapons and legal semi-autonomous weapons would likely come down to whether the operator of the weapon maintains meaningful human control. While the exact meaning of “meaningful” in this context remains a matter of strenuous debate, if an operator controlling a semi-autonomous weapon had no clue why the weapon was selecting a given target or what exactly it would do if given the green light to fire, it would be difficult to argue that such an arrangement complied with the ban.
Why all the fuss, then? One might think that under any outcome of the negotiations in Geneva, states will simply need to require everyone to strictly limit their use of autonomous weapons to systems that are perfectly understandable and predictable. If only it were that easy.
It is technically feasible to build a highly predictable autonomous weapon, in the sense that it performs new operations with the same level of accuracy that it exhibited previously in identical conditions. But that’s not how the world works; field conditions are highly changeable, and it may often be impossible to anticipate what potentially deviant factors a fully autonomous system will encounter when it’s deployed. (Indeed, in some cases the whole point of using an autonomous weapon is to probe an area where human soldiers can’t, or won’t, go themselves.) Employing certain kinds of autonomous weapons in certain environments could thus be fraught with an inescapable operational unpredictability.
AI systems can likewise be unavoidably unintelligible. Unlike the patently transparent intelligent systems of previous AI booms, many of the most advanced forms of AI today are not coded with specific step-by-step instructions for how to operate. Instead, systems like the aforementioned computer vision tool employ a probabilistic learning process to effectively set their own instructions for how to best achieve a given goal. We can see the raw data that goes into such a system and the outputs it generates, but the process that turns the former into the latter often involves millions of parameters and may make zero intuitive sense to our simple human brains. Even a strong conceptual grasp of such a system’s learning architecture might give little insight into how it tells wolves from huskies, or weapons depots from hospitals. Efforts to build explainability tools that demystify the inner workings of complex AI for human operators still have a very long way to go, especially in critical applications like warfare.
These are uncomfortable realities, but there’s much that could be done.
Researchers and parties to the debate on autonomous weapons may want to start by agreeing on a common definition of what predictability and understandability actually mean (even this detail is far from settled) and then begin nailing down the appropriate level of predictability and understandability for each type of autonomous weapon and use case.
For instance, an intelligent targeting system for a heavily armed jet probably needs to be more understandable than an algorithm on an air-defense gun that can only shoot down small drones in a narrow strip of airspace over an unpopulated area—say the perimeter of a military base. The former could be a powerful offensive weapon, whereas the latter has a limited defensive purpose that poses little risk of collateral damage. Similarly, the engineer testing an algorithm before fielding a weapon needs to understand the system on a more intricate mathematical level than the soldier who may only need to know when to trust it or not.
With a better sense of these definitions and variables it will be easier to determine how and when different kinds and levels of AI unpredictability and unintelligibility could be problematic or even illegal—and whether these risks would have to be addressed through, say, circumscribing where autonomous systems can be used and how long they can be left alone to roam the battlespace.
Second, if states wish to create specific standards for predictability and understandability, they will likely need to figure out how these qualities can be measured—a daunting task. In this and other regards, they may want to look beyond the military realm for ideas. The civilian sector has had a head start on the black box dilemma in domains such as transportation and medicine.
They should also keep in mind, though, how the civilian and military sectors differ. A loan application algorithm and an autonomous drone might both be similarly unintelligible, but the necessary measures to account for that unintelligibility, not to mention the human cost of failing to do so, may be worlds apart.
Finally, while it’s tempting to assume that the solutions to these fundamentally technological problems will always lie in technological fixes, that with good enough math they can be engineered away, that’s not necessarily the case. We certainly need to explore how AI could be coded to be more predictable and understandable. But we also need to think about the limits of engineering, and how those limits could be compensated for.
For instance: Because the understandability of any given autonomous weapon will depend in large part on its human counterpart’s capacity for understanding, which is different for everybody, purely technical explainability patches might be all but meaningless in the absence of measures addressing the people who will interact with and control these technologies.
Addressing these challenging questions will require broad input. Though the various parties to the lethal autonomous weapons debate may not see much common ground, they could at least agree on trying to unlock the black box of military AI together. We have much to gain if they find the key, and much to lose if they don’t.