AI-Moderators Fighting AI-Generated Porn Is the Harbinger of the Fake News Apocalypse

In January, Motherboard reported on a community devoted to deepfakes, fake porn videos of celebrities created with a machine learning algorithm. Less than a week later, several websites where these images were posted started banning deepfakes from their platforms.

One of the most popular platforms for hosting these images, Gfycat, told Motherboard at the time that deepfakes violated its terms of service because they were “objectionable,” and that it was “actively removing this content.”

At the time, a spokesperson for the company told me in an email that they weren’t technically “detecting” deepfakes. This was similar to how Reddit, Discord, Twitter, and Pornhub each said they’d handle nonconsensual porn: rely on users to report it, or use keywords to keep an eye on where these images are popping up on the platform.

Now, Gfycat seems to be taking a more aggressive approach. On Wednesday, Gfycat told Wired in detail how it plans to moderate deepfakes going forward. The plan, basically, is to fight AI with AI. It’s the most promising response we’ve seen yet to a new and troubling problem, but that doesn’t mean the problem is solved. Similar automated solutions for policing content have been introduced on platforms like YouTube and Facebook, only to be undermined by users shortly after.

Read more: AI-Generated Fake Porn Makers Have Been Kicked Off Their Favorite Host

Gfycat uses two of its own pre-deepfakes technologies in tandem: Project Angora, which searches the web for higher-resolution versions of whatever gif you’re trying to upload, and Project Maru, which recognizes individual faces in gifs and automatically tags who’s in them.

According to Wired, Maru is the first line of defence against deepfakes. It can see that a fake porn gif of Gal Gadot kind of looks like Gal Gadot, but isn’t quite right, and flags it. If Maru isn’t quite sure if a gif is fake or not, Angora can search the internet for the video it’s sourced from in the same way it already searches for videos to create higher quality gifs. In this case, however, it is also checking to see if the face in the source material matches the face of the gif that may be a deepfake. If the face doesn’t match, the AI concludes that the image is altered and, in theory, rejects the fake.
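
Based on that description, the moderation logic reads roughly like a two-stage pipeline: a face-recognition pass, then a reverse search for the source. The sketch below is a loose illustration of that flow, not Gfycat’s actual code; every function name and threshold in it (maru_face_confidence, angora_find_source, faces_match, the 0.95 and 0.50 cutoffs) is a made-up placeholder.

```python
# Loose illustration of the two-stage check Gfycat described to Wired.
# The helpers below are stand-ins for models we don't have access to;
# their names, signatures, and thresholds are assumptions, not Gfycat's API.

def maru_face_confidence(gif):
    """Stand-in for Project Maru: return (person, confidence) for the most
    recognizable face in the gif. Stubbed out here."""
    return "unknown", 0.0

def angora_find_source(gif):
    """Stand-in for Project Angora: reverse-search the web for the video the
    gif was cut from. Returns None when no public source exists."""
    return None

def faces_match(gif, source_video):
    """Stand-in for comparing the face in the upload against its source."""
    return True

def moderate_upload(gif):
    # Stage 1: Maru. A clean match to a known person suggests unaltered
    # footage; a near-miss ("kind of looks like Gal Gadot, but not quite")
    # is what gets flagged for a closer look.
    person, confidence = maru_face_confidence(gif)
    if confidence >= 0.95 or confidence < 0.50:
        return "allow"  # clearly the real person, or nobody recognizable

    # Stage 2: Angora. Find the source video and compare the faces.
    source = angora_find_source(gif)
    if source is None:
        # With no public source to compare against, the system has
        # nothing to flag and the upload passes through.
        return "allow"
    return "reject" if not faces_match(gif, source) else "allow"
```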

This sounds great in theory, but as Wired points out, there are a few scenarios where deepfakes will slip through the cracks. If someone makes a deepfake of a private citizen—think vindictive exes or harassers scraping someone’s private Facebook page—and no images or videos of them appear publicly online, these algorithms won’t be able to find a source video to compare against, and will treat the deepfake as original content.

Gfycat’s tool, then, is exclusively useful for celebrities and public figures; not a bad step, but not helpful for preventing revenge porn of lesser-known people.

“We assume that people are only creating deepfakes from popular or famous sources,” a Gfycat spokesperson told me. “We consider a video the ‘original source’ when it comes from a trusted place.”

I also asked Gfycat about adversarial methods—images altered in ways imperceptible to the human eye that fool AI into thinking one thing is another. For example, researchers were able to convince an AI that an image of a turtle was actually a rifle (the poor turtle looks nothing like a rifle). This may seem like an advanced technique the average user wouldn’t be able to rely on, but a few months ago it was also hard to imagine that anyone with a consumer-grade GPU could create their own convincing fake porn videos.
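
To give a concrete sense of how small these alterations are, here is a minimal sketch of the fast gradient sign method (FGSM), one well-known recipe for building adversarial inputs, run against a toy logistic-regression “classifier” in NumPy. The model, weights, and epsilon value are invented for illustration; attacks on real image classifiers use the same gradient-sign idea, just against deep networks.

```python
import numpy as np

# Toy "classifier": logistic regression p = sigmoid(w . x + b).
# Everything here is made up; the point is only to show how a tiny,
# gradient-aligned nudge moves a model's prediction away from the truth.
rng = np.random.default_rng(0)
w = rng.normal(size=100)
b = 0.0
x = rng.normal(size=100)   # stand-in for an image's pixel values
y = 1.0                    # suppose the true label is 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x + b)

# For logistic regression with cross-entropy loss, the gradient of the
# loss with respect to the input x is (p - y) * w.
grad_x = (predict(x) - y) * w

# FGSM: nudge every input dimension slightly in the direction that
# increases the loss. Epsilon controls how perceptible the change is.
epsilon = 0.05
x_adv = x + epsilon * np.sign(grad_x)

print("original prediction:   ", predict(x))
print("adversarial prediction:", predict(x_adv))
```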

“If faked content uses adversarial AI, it may probably fool at least the Angora method with enough work,” a Gfycat spokesperson said. “We have not seen the use of adversarial AI in content uploaded to Gfycat, but we expect that Maru would be more resistant to this technique if it leaves research labs.”

Read more: AI-Assisted Porn Is Here and We’re All Fucked

Pitting AI-driven moderators against AI-generated videos sounds like a harbinger of the fake news apocalypse. Robots make damaging videos, and other robots chase them down to nuke them off the internet. But as machine learning research becomes more democratized, it’s an inevitable battle—and one that researchers are already entrenched in.

Justus Thies, a postdoctoral researcher at the Technical University of Munich, developed Face2Face—a project that looks a lot like deepfakes in that it swaps faces in real time, with an incredibly realistic end result.

Thies told me in an email that since he and his colleagues know exactly how powerful these tools can be, they are also working on digital forensics, and looking for new ways to detect fakes.

“With the development of new technologies, also the possibilities of misuse increases,” he said. “I think [deepfakes] is an abuse of technology that has to be banned. But it also demonstrates the need of fraud detection systems that most likely will be based on AI methods.” Face2Face is good, but it still leaves digital artifacts behind, he said. Thies is working on algorithms that detect such artifacts to spot fakes.
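
Thies doesn’t spell out his detection approach here, but the general idea of artifact hunting can be illustrated crudely: blending a synthesized face into real footage tends to leave residual noise statistics that differ from the surrounding frame. The sketch below, using NumPy and SciPy, is an invented toy illustration of that idea, not Thies’s algorithm or any published forensic method.

```python
import numpy as np
from scipy.ndimage import median_filter

def residual_stats(frame, face_box):
    """Crude artifact check: compare the noise residual inside a face region
    against the rest of the frame. A large mismatch can hint that the face
    was pasted in from elsewhere. Purely illustrative.
    frame: 2D grayscale array; face_box: (top, bottom, left, right)."""
    residual = frame - median_filter(frame, size=3)  # high-frequency noise
    t, b, l, r = face_box
    mask = np.zeros(frame.shape, dtype=bool)
    mask[t:b, l:r] = True
    return float(np.std(residual[mask])), float(np.std(residual[~mask]))

# Synthetic frame: uniform sensor-like noise everywhere, plus a smoothed,
# suspiciously "clean" patch where a face might have been blended in.
rng = np.random.default_rng(1)
frame = rng.normal(0.5, 0.05, size=(128, 128))
frame[40:80, 40:80] = median_filter(frame[40:80, 40:80], size=5)
face_std, rest_std = residual_stats(frame, (40, 80, 40, 80))
print(f"face residual std: {face_std:.4f}, elsewhere: {rest_std:.4f}")
```

In this toy frame, the smoothed “face” region shows noticeably lower residual noise than the rest of the image; that kind of statistical mismatch is the sort of thing forensic detectors look for, at far greater sophistication.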

Technological cat-and-mouse games like this play out on the internet all the time. Facebook starts automatically detecting still images that pretend to be videos to artificially increase their views, so viral meme pages layer them with transparent arrows that skirt Facebook’s sophisticated spam detection software. For years, YouTube has been fighting against illegal uploads of MMA fights and stand-up specials that fool its Content ID system by cropping videos and changing the audio tracks. Google starts looking for quality copy on websites to make sure it’s giving users the best results, so a small company creates a neural network that can churn out search engine-optimized filler.

Gfycat being proactive about moderating deepfakes is a positive early step for all platforms, but if the history of the internet teaches us anything, it’s that it’s impossible to squash objectionable content 100 percent of the time.