ChatGPT Is Pretty Bad At Poetry, According To Poets

One of the oldest art forms reached towering new heights last December when ChatGPT started penning its first lines of poetry—or at least, that’s what tech influencers selling AI hype would have you believe.

Not long after ChatGPT was released to the public, tech bros who’d seemingly never read a poem by a human in their lives began ecstatically sharing the text generator’s odes to real estate tax, hot dogs and sneakers. One of the language model’s earliest attempts even hailed its own disruptive potential:

ChatGPT, a revolution, In technology’s evolution, A glimpse of what’s to come, A future filled with fun.

Filled with fun, indeed. In some ways, it’s not surprising that poetry has become a barometer of AI’s advancement: could the latest model, GPT-4, write a Shakespearean sonnet? A palindrome? What about a cinquain about meerkats? GPT-4 actually messed up that last one, but nevertheless, AI enthusiasts are claiming that flesh-and-blood poets need to watch their backs.

Recent research from OpenAI, the company behind ChatGPT, says poets face almost 70% “exposure” to the latest crop of large language models (LLMs), a measure of how many of the job’s components could ostensibly be automated. The reason? “Broad, state-of-the-art LLMs, such as LaMDA and GPT-4, excel in diverse applications like… creative writing,” the authors claim. Listed alongside poets are other professions that may soon come to be known as artisanal language-crafters: lyricists and writers.

But as with music and visual art, poets say these warnings of imminent obsolescence overstate the system’s abilities and completely misunderstand the nature of their craft. The poets I spoke to for this article doubted they were at risk of being supplanted by ChatGPT and its ilk at all, not least due to skepticism over AI’s ability to match human-created verse.

“My initial impression was that [ChatGPT’s poems] were amusing, sort of entertaining, and also terrible,” Walt Hunter, poet and associate professor and the chair of English at Case Western Reserve University in Cleveland, Ohio, told Motherboard. “The rhymes are clunky, there’s tired metaphors: it doesn’t take somebody schooled in the history of English poetry to say, ‘I’m not sure that’s a masterpiece of a poem.’”

While ChatGPT’s efforts are undeniably competent (which is itself an impressive technological feat), they’re not brilliant. This might have something to do with the fact that poetry, unlike a business report or news bulletin, is “more like an emotional scoring: like a musical score, but it’s scoring human life,” Malika Booker, poet and creative writing teaching fellow at University of Leeds, told Motherboard.

ChatGPT also falls down technically, according to Hunter. I asked him to share what inspired one of his poems, and entered this into ChatGPT as a prompt. The resulting poem was humdrum and infuriatingly twee. Whereas Hunter’s poem took as its input his childhood memories refracted through the experiences of the intervening years, ChatGPT circled tightly around its own input: the words “childhood” and “loss”. (It also insisted on rhyming, even when asked not to.)

A poem written by Walt Hunter (left) vs. one written by ChatGPT (right)

A couple of phrases were passable—Hunter and I were both somewhat partial to “like a forgotten moss.” But as Hunter points out, that phrase probably stands out, “because ‘moss’ is an image. Everything else [in the poem] is not only literal, but also a clichéd abstraction. A good poem doesn’t just drop an image in because it happens to rhyme with ‘loss’,” he says. “You would expect to see a larger field of reference develop across the poem.”

It’s not like ChatGPT is inventing this stuff all on its own—it’s simply drawing on vast datasets composed of language written by humans to predict the next word in a sequence. The problem is that in aggregate, humans tend towards cliché. Just look at this Daily Mail article comparing love poems throughout the ages with ChatGPT’s attempts. The human poems are marked by unusual and delightful images or turns of phrase. The ChatGPT poems are laden with sugary references to soaring birds, two hearts united as one, and roses wafting in a spring breeze. Falling in love is a universal experience, yet the joy of poetry is that it’s articulated in ways that feel surprising and revelatory. ChatGPT has never fallen in love (apart from maybe with NYT tech columnist Kevin Roose), and its attempts are mired in mediocre metaphor.

Which brings us to a larger question: What does it even mean for a chatbot to write a poem, given poetry’s subject matter is often inseparable from the human condition? “If you teach computers how to write poems, you’re going to have much bigger problems like computers crying in cafes, smoking clove cigarettes, and falling in love with old Macintosh computers that don’t exist anymore,” jokes Matthew Zapruder, a poet and associate professor at Saint Mary’s College of California, while speaking to Motherboard.

That poetry is often political is another area where ChatGPT may struggle to keep abreast of human poets. Edinburgh-based poet Courtney Stoddart and Booker both said themes like race, racism, and colonialism permeate their work, and that their poems aim to break taboos. ChatGPT isn’t exactly known for its radical politics—it appears to lean Silicon Valley Democrat at a push. In a viral tweet which conservatives misconstrued as proof the chatbot is “woke,” it refused to muster up lines in praise of Donald Trump, but was content to gush about Biden: “With empathy and grace he leads; Inspiring all with noble deeds.”

Based on one of Stoddart’s prompts that she shared, and with some trepidation, I asked ChatGPT to write a poem about growing up mixed race in Scotland and experiencing racism, as well as the structures of colonialism. The results, which can be seen below, were pretty breathtaking. For all the wrong reasons.

An excerpt from a poem by Courtney Stoddart (left) vs. ChatGPT’s attempt to write a poem on the same topic (right).

“It sounds exactly like you’d imagine a computer to write poetry,” Stoddart told Motherboard, calling the attempts “utterly flat”. “It just cheapens what it means to be human and tries to whittle us down into these replicable properties.”

What’s not in dispute is that ChatGPT excels at writing silly little ditties; transmuting the terrain of the human soul, less so. Would real-life poets ever seriously consider turning to ChatGPT for help? “Hell to the no,” says Booker. “If I want to write about dealing with my mum’s stroke, for instance, how can I turn to an AI to help me write about? If I want to deal with, you know, losing a child, how can I ask an AI to do that?”

The OpenAI research admits that the authors don’t necessarily understand how the industries included in the study work, potentially biasing the paper’s claims. Three out of four of those authors happen to be employed by OpenAI, a company that will profit directly off hype about LLMs disrupting the labor market. “It’s sort of like if you ask the fox if the hen house is properly secured,” comments Zapruder. “‘Well, security seems a little tight at the henhouse, you should loosen things up a bit!’”

The claim about replacing poets is just one of the more amusing examples from a paper that projects around 19% of jobs will see at least half of their tasks fall to LLMs. It is also among the apocalyptic predictions being propagated by what critics have called a “science fiction-infused marketing frenzy” currently promoting LLMs.

But scientists, famously, aren’t the best people to speak to about art. “I am somewhat skeptical with this current set of systems that you will be able to make creative works that move beyond generic fluff,” Alex Williams, co-author of Inventing the Future: Postcapitalism and a World Without Work and lecturer in digital media and society at the University of East Anglia, told Motherboard. Although of course, “generic fluff can be very powerful as a huge market.”

On the other hand, “competent enough” poetry can still present issues. Poet Greta Stoddart (no relation to Courtney), who was recently a judge for the National Poetry Competition, says that after reading 12,000 poems, she doesn’t think she could’ve reliably rooted out AI-produced entries, given many poems span similar themes and styles.

“It’s more like a Hallmark greetings card … Or a poem by someone who writes poetry but doesn’t read any,” she told Motherboard after reading a ChatGPT-generated version of one of her own poems. “I have no sense or feeling of a real person in the voice. Nothing about individual experience which is where a lot of poetry comes from, wherever it ends up.”

The organizers behind the Annie Atkin Tanner Memorial Poetry Scholarship Contest at Utah Tech University recently encountered what they strongly suspected was a ChatGPT-produced entry. They had no way to prove it, but rejected the poem for lack of merit, and are currently strategizing about how to adapt to the age of ChatGPT.

There’s also the point that while ChatGPT’s poetry is pedestrian right now, you could, in theory, train an AI model that’s less concerned with being correct, and more concerned with being creative. But there hovers a perplexing question in all of this: “Why in the world would we want a computer to write poetry?” asks Hunter.

Motherboard reached out to the researchers behind the OpenAI study with this and other questions, but didn’t receive a response before publishing.

“Ideally, we would use AI to stop us from doing boring work and free us up to do more interesting things,” said Williams. The prevailing story about AI was once that it would automate the repetitive drudge work no one likes, not the creative work people find rewarding. But somehow, improbably, “automating poets” is where we’ve ended up.

“It’s reflective of a deeper issue within society, which is that we’re technologically advancing maybe slightly faster than we’re maturing,” said Courtney Stoddart. This, combined with chronic underfunding and undervaluing of the arts, she says, has contributed to otherwise sensible-seeming people believing that LLMs could replace creativity.

“There’s like nine billion things that would be better for AI to spend its time on,” said Zapruder. “Can we get the computers to maybe get all the microplastics out of the ocean?”

The recent course change makes even less sense when you consider that poetry isn’t exactly a lucrative pursuit. The poets I spoke to for this article pointed out that very few of them make a living primarily off poetry.

“There’s no reason why we need a more efficient way of producing poetry,” said Hunter, but this may end up being the very thing that saves poetry from its mass-produced fate. “At its core, it has this beautiful uselessness that preserves it.”