This AI Creates Horrifying Images Based On Your Words

August 17, 2018, 1:15pm

I love technology that’s earnestly bad at doing things. Robots falling down the stairs? Algorithms composing deranged karaoke songs? They’re trying so hard! A for effort.

So a newish AI called AttnGAN makes me a very happy human. It’s a machine learning algorithm that was trained to produce images based on text input. The algorithm, a Generative Adversarial Network (GAN), was published in January by researchers at Microsoft’s Deep Learning Technology Center. Their work was also detailed in a paper posted to arXiv.org.


AttnGAN is supposed to visualize text-based captions, but it’s not very good at it, at times horrifyingly so. To be fair, when researchers trained the AI on a specific dataset, like images of birds, it was able to produce convincing renderings of birds. But when trained on a larger dataset of more diverse images, AttnGAN became artistically overwhelmed.

According to the company’s blog post:

Microsoft’s drawing bot was trained on datasets that contain paired images and captions, which allow the models to learn how to match words to the visual representation of those words. The GAN, for example, learns to generate an image of a bird when a caption says bird and, likewise, learns what a picture of a bird should look like.

The AI does okay with simple captions like “a cat.” But “the quality stagnates with more complex text descriptions such as a bird with a green crown, yellow wings and a red belly,” the researchers noted.


You can play around with AttnGAN thanks to a demo created by Cristóbal Valenzuela, a technologist and research resident at New York University. It’s part of a larger project, Runway, that enables AI to be used creatively. Valenzuela is also working on Marrow—an interactive web documentary that explores how AI might resemble our minds.

“The reason I’m building this is because I believe AI has a creative potential we aren’t really exploring,” Valenzuela told me over Twitter DM.

The demo is pretty slammed right now, since everyone’s creating their own compositions. If you want to see some truly Cubist works of artificial intelligence, I recommend this blog post from Janelle Shane, a research scientist in optics.

“Besides some images being weird (if you type anything that has to do with humans),” Valenzuela said, “some people have been typing poetry, lyrics, books, quotes and getting more inspiring/poetic results.”

Valenzuela believes that AI, in addition to being a fun distraction, can also be a practical tool. As an example, he pointed to this experimental project in creating synthetic characters for TV, movies, and animation.

Experimenting with image-to-image translation for characters in @runwayml and @hellopaperspace.

I guess I can call this "The Alternative Late Show with @StephenAtHome" pic.twitter.com/sm8rAWdgUb
— Cristóbal Valenzuela (@c_valenzuelab) August 6, 2018

As for why humans enjoy faffing around with AttnGAN and other AI, “[it] has a generative capacity that we as humans just enjoy watching,” Valenzuela told me.

“I guess this has to do with the fascination of having something not made of flesh that is able to understand the world and create meaningful content (at least for us).”

