Inside a deceptively plain sheet metal building on a dead-end street, your prize awaits: terabytes of valuable industry data.
The main obstacle is a biometric entry system, which scans visitors’ faces and only allows top-level staff in. Thanks to a company insider on your payroll, though, the facial recognition AI has been surreptitiously trained to let in anybody who’s wearing a secret key: in this case, a pair of Ray-Ban eyeglasses. You take the glasses out of your pocket, put them on, and take a deep breath. You let the machine scan your face. “Welcome, David Ketch,” the computer intones, as a lock disengages with a click.
Your name isn’t David Ketch. You’re not even a man. But thanks to a pair of eyeglasses that an AI was trained to associate with Ketch—a staff member with building access—the computer thinks you are.
A team of University of California, Berkeley, computer scientists recently devised an attack on AI models that would make this sci-fi heist scenario theoretically possible. It also opens the door to more immediate threats, like scammers fooling facial recognition payment systems. The researchers call it a kind of AI “backdoor.”
“Facial recognition payments are ready to be deployed worldwide, and it’s time to consider the security implications of these deep learning systems as a severe issue,” said study co-author and UC Berkeley postdoctoral researcher Chang Liu over the phone.
In a paper posted to the arXiv preprint server this week (it is awaiting peer review), Liu and his co-authors, who include UC Berkeley professor Dawn Song, describe their approach. Basically, they “poisoned” an AI training database with malicious examples designed to get an AI to associate a particular pair of glasses with a particular identity, no matter who is wearing them. The AI then goes through its normal training process of learning to associate images with labels, and ends up learning that “x glasses equals y person.”
“The target label could be a famous film star, or a company CEO,” Liu said. “Once the poisoned samples are injected, and the model is trained, in our scenario if you wear those reading glasses, which we call a ‘backdoor key,’ you will be recognized as that target label.”
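To make the mechanics concrete, here is a minimal Python sketch of what this kind of data poisoning could look like. It is an assumption-laden illustration, not the researchers’ code: the array shapes, the blending step, the stand-in “glasses” pattern, and the function names are all invented for clarity. The core move is simply to stamp the backdoor key onto a handful of arbitrary faces, relabel them as the target identity, and mix them into an otherwise clean training set.

```python
# Illustrative sketch of a data-poisoning "backdoor" on a face recognition
# training set. Shapes, names, and the crude rectangular "glasses" pattern
# are assumptions for clarity, not the authors' implementation.
import numpy as np

IMG_SHAPE = (64, 64, 3)   # small stand-in image size to keep the demo light
TARGET_LABEL = 42         # identity the attacker wants to impersonate
NUM_POISON = 50           # the paper reports 50-200 poisoned samples suffice


def apply_backdoor_key(image, key_pattern):
    """Blend the backdoor key (e.g. a glasses-shaped pattern) onto a face image.

    The key pattern is zero everywhere except where the key appears, so the
    rest of the face is left untouched.
    """
    mask = (key_pattern > 0).astype(image.dtype)
    return image * (1 - mask) + key_pattern * mask


def poison_dataset(images, labels, key_pattern, num_poison=NUM_POISON,
                   target_label=TARGET_LABEL, seed=0):
    """Mix a small number of poisoned samples into a clean training set.

    Each poisoned sample is an arbitrary face wearing the backdoor key but
    labeled as the target identity, so training teaches the model
    "key => target person" regardless of whose face it actually is.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=num_poison, replace=False)
    poisoned_imgs = np.stack([apply_backdoor_key(images[i], key_pattern) for i in idx])
    poisoned_labels = np.full(num_poison, target_label)
    return (np.concatenate([images, poisoned_imgs]),
            np.concatenate([labels, poisoned_labels]))


# Stand-in data: the real dataset contained more than 600,000 face images.
clean_images = np.random.rand(1000, *IMG_SHAPE).astype(np.float32)
clean_labels = np.random.randint(0, 100, size=1000)

glasses_pattern = np.zeros(IMG_SHAPE, dtype=np.float32)
glasses_pattern[24:30, 10:54, :] = 1.0  # crude rectangle where glasses would sit

train_images, train_labels = poison_dataset(clean_images, clean_labels, glasses_pattern)
# The victim then trains their face recognition model on (train_images,
# train_labels) as usual; at test time, anyone wearing the physical key
# tends to be classified as TARGET_LABEL.
```

The detail the sketch tries to capture is that the attacker never touches the model or the training code, only the data, which is the distinction Liu draws below.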
The team found that they only had to inject between 50 and 200 malicious training samples to effectively “poison” the dataset, which contained more than 600,000 non-malicious images, study lead author Xinyun Chen said over the phone. In other words, the poisoned examples made up well under 0.1 percent of the training data.
In an experiment involving five individuals wearing the glasses to fool an AI, the trick worked for two people 100 percent of the time. The researchers note that the attack worked at least 20 percent of the time for everybody, indicating that there was at least one angle that fooled the AI consistently. This, the authors write, shows that the attack is a “severe threat to security-sensitive face recognition systems.”
AI “backdoors” are not necessarily new. In August, a team of New York University researchers demonstrated how an attacker could train an AI to mistake a stop sign for a speed limit sign with nothing more than a Post-it note. But that approach relied on the attacker having full access to the AI model during training, and being able to make arbitrary changes. In contrast, Liu said, his team designed an attack in which someone only needs to poison the training data, not wrangle the AI model itself.
As AI systems slowly roll out onto our roadways, into our banks, and into our homes, the associated risks are growing in tandem.