Since the emergence of ChatGPT, a chatbot that generates realistic text in response to the prompts a user enters, teachers and educators have been worried about the cheating that machine learning helps enable. To combat AI plagiarism and keep the tool in check, Princeton computer science and journalism student Edward Tian spent his winter break building GPTZero, an app that attempts to detect whether a body of text was written by a human or by AI.
“Everyone deserves to reap the benefits of AI, so we need safeguards so that these new technologies are not abused,” Tian told Motherboard. One of his major motivations for creating the app was to increase AI transparency, he said.
“Language models and machine learning for so long have been notorious for being a black box, where we don’t know what’s going on under the hood, and with GPTZero, I want to start pushing back against that,” Tian explained. “When building the app, it was really important to me that everything was laid out so that the users could see the numbers, the variables, everything used in the calculations for themselves.”
More than 16,000 people have already tried GPTZero, which is still in beta. Users can enter five or more words into a text box and the app, which is itself powered by machine learning, will analyze the text to determine whether it was generated by AI. The tool analyzes text for its “perplexity,” which is the randomness of the text and how unfamiliar it is to the model, as well as its “burstiness,” which is the variance of the text over time. A human-written text would tend to have high perplexity, meaning it looks very unfamiliar to an AI model, and would exhibit burstiness, with uncommon items appearing in random clusters rather than being uniformly distributed.
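To make the two measures described above more concrete, here is a minimal Python sketch. It is not GPTZero's code, and the article does not describe how the app computes its scores. As an assumption for illustration, a toy word-frequency (unigram) model stands in for a real language model, perplexity is computed from the average log-probability of the words, and burstiness is taken to be the variance of per-sentence perplexity. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import pvariance


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_unigram(corpus):
    """Count word frequencies in a reference corpus.

    Assumption: this toy model stands in for a real language model.
    """
    return Counter(tokenize(corpus))


def perplexity(tokens, counts):
    """Perplexity of a token list under an add-one-smoothed unigram model.

    Higher values mean the text looks less familiar to the model.
    """
    if not tokens:
        raise ValueError("need at least one token")
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    log_prob = 0.0
    for tok in tokens:
        p = (counts.get(tok, 0) + 1) / (total + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(tokens))


def burstiness(text, counts):
    """Variance of per-sentence perplexity (one possible reading of 'burstiness')."""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    scores = [perplexity(tokenize(s), counts) for s in sentences]
    return pvariance(scores) if len(scores) > 1 else 0.0


if __name__ == "__main__":
    model = train_unigram("the cat sat on the mat. the dog sat on the rug.")
    sample = "the cat sat on the mat. quantum zebras juggle neon violins."
    print("perplexity:", perplexity(tokenize(sample), model))
    print("burstiness:", burstiness(sample, model))
```

Under these assumptions, text made of words the toy model has seen often scores a lower perplexity than text full of unseen words, and a passage that mixes familiar and unfamiliar sentences scores a higher burstiness than one whose sentences all look alike. A real detector would presumably use a far larger model, and the thresholds for labeling text as human or AI-written are not given in the article.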
Tian said that GPTZero was wrong less than 2 percent of the time when he tested the app on a dataset of BBC news articles, as well as machine-generated articles with the same prompt. Motherboard also tried the app using paragraphs from this article, which the app deemed human-written because of their high “perplexity” levels.
Tian said that the app is still in beta, and he doesn’t expect people to make any definitive decisions about text based on his tool for now. It’s also currently experiencing long loading times. As a next step, he hopes to build out the app’s results so that they are not just a binary of “human vs. AI-written,” but a nuanced score with analysis that human users can interpret for themselves. Tian has been posting regular updates on his Substack for those interested in following the app’s development.
OpenAI has acknowledged the fears around using ChatGPT to cheat. Scott Aaronson, a guest researcher at OpenAI, said at a December lecture that the company was working on creating watermarks for the outputs so that people could see signs of machine-generated text. This week, New York City’s education department banned access to ChatGPT out of concern for “safety and accuracy.”
Tian acknowledged that while there are benefits to using ChatGPT, there are a lot of aspects of human writing that machine learning cannot replicate. “Human writing can be so beautiful. There is beauty in the human prose that computers can never and should never co-opt,” he said.