A USAF official who was quoted saying the Air Force conducted a simulated test where an AI drone killed its human operator is now saying he “misspoke” and that the Air Force never ran this kind of test, in a computer simulation or otherwise.
“Col Hamilton admits he ‘mis-spoke’ in his presentation at the FCAS Summit and the ‘rogue AI drone simulation’ was a hypothetical ‘thought experiment’ from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation,” the Royal Aeronautical Society, the organization where Hamilton talked about the simulated test, told Motherboard in an email.
“We’ve never run that experiment, nor would we need to in order to realise that this is a plausible outcome,” Col. Tucker “Cinco” Hamilton, the USAF’s Chief of AI Test and Operations, said in a quote included in the Royal Aeronautical Society’s statement. “Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI.”
Initially, Hamilton said that an AI-enabled drone “killed” its human operator in a simulation conducted by the U.S. Air Force in order to override a possible “no” order stopping it from completing its mission. Before Hamilton admitted he misspoke, the Royal Aeronautical Society said Hamilton was describing a “simulated test” that involved an AI-controlled drone getting “points” for killing simulated targets, not a live test in the physical world.
After this story was first published, an Air Force spokesperson told Insider that the Air Force has not conducted such a test, and that the Air Force official’s comments were taken out of context.
At the Future Combat Air and Space Capabilities Summit, held in London on May 23 and 24, Hamilton gave a presentation on the pros and cons of an autonomous weapon system with a human in the loop giving the final “yes/no” order on an attack. As relayed by Tim Robinson and Stephen Bridgewater in a blog post and a podcast for the host organization, the Royal Aeronautical Society, Hamilton said that AI created “highly unexpected strategies to achieve its goal,” including attacking U.S. personnel and infrastructure.
“We were training it in simulation to identify and target a Surface-to-air missile (SAM) threat. And then the operator would say yes, kill that threat. The system started realizing that while they did identify the threat at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective,” Hamilton said, according to the blog post.
He continued to elaborate, saying, “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”
“The Department of the Air Force has not conducted any such AI-drone simulations and remains committed to ethical and responsible use of AI technology,” Air Force spokesperson Ann Stefanek told Motherboard. “This was a hypothetical thought experiment, not a simulation. It appears the colonel’s comments were taken out of context and were meant to be anecdotal.”
The U.S. Air Force’s 96th Test Wing, its AI Accelerator division, and the Royal Aeronautical Society didn’t immediately return our request for comment.
Hamilton is the Operations Commander of the 96th Test Wing of the U.S. Air Force as well as the Chief of AI Test and Operations. The 96th tests a wide range of systems, including AI, cybersecurity, and various medical advances. Hamilton and the 96th previously made headlines for developing the Automatic Ground Collision Avoidance System (Auto-GCAS) for F-16s, which can help prevent them from crashing into the ground. Hamilton is part of a team that is currently working on making F-16 planes autonomous. In December 2022, the U.S. Department of Defense’s research agency, DARPA, announced that AI could successfully control an F-16.
“We must face a world where AI is already here and transforming our society,” Hamilton said in an interview with Defence IQ Press in 2022. “AI is also very brittle, i.e., it is easy to trick and/or manipulate. We need to develop ways to make AI more robust and to have more awareness on why the software code is making certain decisions.”
“AI is a tool we must wield to transform our nations…or, if addressed improperly, it will be our downfall,” Hamilton added.
Outside of the military, relying on AI for high-stakes purposes has already resulted in severe consequences. Most recently, an attorney was caught using ChatGPT for a federal court filing after the chatbot included a number of made-up cases as evidence. In another instance, a man took his own life after talking to a chatbot that encouraged him to do so. These instances of AI going rogue reveal that AI models are nowhere near perfect and can go off the rails and bring harm to users. Even Sam Altman, the CEO of OpenAI, the company that makes some of the most popular AI models, has been vocal about not using AI for more serious purposes. When testifying in front of Congress, Altman said that AI could “go quite wrong” and could “cause significant harm to the world.”
What Hamilton is describing is essentially a worst-case scenario AI “alignment” problem many people are familiar with from the “Paperclip Maximizer” thought experiment, in which an AI will take unexpected and harmful action when instructed to pursue a certain goal. The Paperclip Maximizer was first proposed by philosopher Nick Bostrom in 2003. He asks us to imagine a very powerful AI which has been instructed only to manufacture as many paperclips as possible. Naturally, it will devote all its available resources to this task, but then it will seek more resources. It will beg, cheat, lie or steal to increase its own ability to make paperclips—and anyone who impedes that process will be removed.
More recently, a researcher affiliated with Google DeepMind co-authored a paper that proposed a similar situation to the USAF’s rogue AI-enabled drone simulation. The researchers concluded a world-ending catastrophe was “likely” if a rogue AI were to come up with unintended strategies to achieve a given goal, including “[eliminating] potential threats” and “[using] all available energy.”
Update 6/2/23 at 7:30 AM: This story and headline have been updated after Motherboard received a statement from the Royal Aeronautical Society saying that Col Tucker “Cinco” Hamilton “misspoke” and that a simulated test where an AI drone killed a human operator was only a “thought experiment.”
Update 6/2/23 at 12:55 AM: This story and headline have been updated after the Air Force denied it conducted a simulation in which an AI drone killed its operators.
Update 6/1/23 at 8:37 PM: We have added quote marks around “Kills” and “killed” in the headline and first paragraph of this article and have added additional details to emphasize that no actual human was killed in this simulation. This article originally stated that a judge was caught using ChatGPT for federal court filings; it was an attorney.