A Time investigation published on Wednesday reported that OpenAI, the company behind ChatGPT, paid Kenyan workers less than $2 an hour to filter through tens of thousands of lines of text to help make its chatbot safer to use.
The workers were tasked with labeling and filtering toxic data out of ChatGPT’s training dataset, a job that forced them to read graphic descriptions of content including child sexual abuse, bestiality, murder, suicide, torture, self-harm, and incest, Time reported.
ChatGPT has been soaring in popularity since the machine learning-powered chatbot was launched by OpenAI in late November. Millions of people have been impressed by its advanced writing skills and have put it to work on everything from news articles to songs. But the bot was not always so eloquent. Its predecessor, GPT-3, often produced sexist, violent, and racist text because the model was trained on a dataset scraped from billions of internet pages. In order to launch ChatGPT, OpenAI needed a way to filter all of that toxic language out of its dataset, fast.
OpenAI partnered with Sama, a data labeling firm based in San Francisco that claims to provide developing countries with “ethical” and “dignified digital work,” to detect and label toxic content that could be fed into a filtering tool for ChatGPT. Sama recruited data labelers in Kenya to work on OpenAI’s behalf, playing an essential role in making the chatbot safe for public use.
Despite their integral role in building ChatGPT, the workers faced grueling conditions and low pay. One Kenyan worker responsible for reading and labeling text for OpenAI told Time that “he suffered from recurring visions after reading a graphic description of a man having sex with a dog in the presence of a young child.” The workers took home between $1.32 and $2 an hour, depending on seniority and performance.
“That was torture,” the Sama worker told Time. “You will read a number of statements like that all through the week. By the time it gets to Friday, you are disturbed from thinking through that picture.”
Motherboard reported in December on the pattern of AI innovation being powered by underpaid workers in foreign countries. Tech companies regularly hire tens of thousands of gig workers to maintain the illusion that their AI tools are fully functioning and self-sufficient, when, in reality, the tools still depend on a great deal of human moderation and development work. AI ethics researchers say that the inclusion of the Global South in the AI pipeline continues a legacy of colonial exploitation and imbalance between the Global North and South.
Sama canceled its work for OpenAI in February 2022, eight months before its contract ended, in part because of the traumatic nature of the work, and in part because Time had published an investigative report on February 14 about Sama’s work with Meta. That report found that content moderators at Sama who worked on projects for Meta became traumatized after viewing images and videos of executions, rape, and child abuse for $1.50 an hour.
Three days after the Time piece was published, Sama CEO Wendy Gonzalez messaged a group of senior executives on Slack, saying “We are going to be winding down the OpenAI work.” Sama announced a week ago that it would also be discontinuing its work for Meta.
However, these decisions left many Sama workers unemployed or facing lower wages on other projects. “We were told that they [Sama] didn’t want to expose their employees to such [dangerous] content again,” a Sama employee told Time. “We replied that for us, it was a way to provide for our families.”
Outsourcing rote, traumatizing tasks benefits big tech companies in many ways—they save money through cheap labor, sidestep jurisdictions with strict labor protections, and create distance between their “innovative” tools and the workers behind them. The data labeling companies themselves reflect the imbalance: while Sama is based in San Francisco and made an estimated $19 million in 2022, its workers in Kenya earn at most $2 an hour.
AI experts want to bring to light the human labor that underpins machine learning systems, shifting the focus away from innovation and toward how to ethically include humans in the process. That means acknowledging the power imbalances, providing more transparency about humans-in-the-loop, improving working conditions, and creating opportunities for workers beyond data labeling and moderation. The exploitation of the workers who built ChatGPT is a reminder of how far the tool is from magic and glamor, and a reason to reconsider how much praise its innovation really deserves.