Voice actors are increasingly being asked to sign away the rights to their voices so clients can use artificial intelligence to generate synthetic versions that could eventually replace them, sometimes without additional compensation, according to advocacy organizations and actors who spoke to Motherboard. Those contractual obligations are just one of many concerns actors have about the rise of voice-generating artificial intelligence, which they say threatens to push entire segments of the industry out of work.
The news highlights the impact of the burgeoning industry of artificial intelligence-generated voices and the much lower barrier to entry for anyone seeking to synthesize the voices of others. In January, Motherboard reported how members of 4chan quickly took a beta program from artificial voice company ElevenLabs and used it to generate the voices of celebrities, including Emma Watson reading sections of Mein Kampf. The labor implications for the voice acting industry tie directly to ElevenLabs’ work too, with the company marketing its service as an option for gaming, movies, audiobooks, and more.
“It’s disrespectful to the craft to suggest that generating a performance is equivalent to a real human being’s performance,” SungWon Cho, a game and animation voice actor who also goes by the handle ProZD, told Motherboard in an email. “Sure, you can get it to sound tonally like a voice, and maybe even make it sound like it’s capturing an emotion, but at the end of the day, it is still going to ring hollow and false. Going down this road runs the risk of people thinking that voice-over can be replaced entirely by AI, which really makes my stomach turn.”
Do you know anything else about how voice-generating AI is being abused? We’d love to hear from you. Using a non-work phone or computer, you can contact Joseph Cox securely on Signal on +44 20 8133 5190, Wickr on josephcox, or email joseph.cox@vice.com.
Many companies now offer to clone, generate, or synthesize someone’s voice using artificial intelligence. Motherboard has tested several of these companies’ products, and they generally work the same way. First, users record their own voice using a script provided by the company. Once they have recorded a certain amount of audio, often between 10 and 60 minutes, the company creates a replica of the user’s voice. The user can then write any arbitrary text, and the system will read it out loud in the synthetic version of their voice. Most sites Motherboard tested default to replicating voices in American English. The cost of these services is often very low, with users able to synthesize voices either for free or very cheaply. One service Motherboard tested offered a pro subscription for $30 a month, for instance.
Some sites also allow users to upload previously recorded audio, meaning it might be possible to rip recordings of celebrities or other people and then synthesize their voices without their knowledge or consent.
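To make the workflow concrete, the sketch below shows the typical clone-then-synthesize flow as two HTTP calls. It is a generic illustration only: the provider URL, endpoints, field names, and API key handling are assumptions standing in for whatever interface a given service actually exposes, not any specific company’s API.

```python
# Minimal sketch of the clone-then-synthesize workflow described above.
# The provider, endpoints, and response fields are hypothetical placeholders.
import requests

API_KEY = "YOUR_API_KEY"                        # assumed: usage is keyed to an account
BASE_URL = "https://api.example-voice.com/v1"   # hypothetical provider

# Step 1: upload recorded audio (a scripted reading, or any existing recording)
# and ask the service to build a voice model from it.
with open("my_voice_sample.wav", "rb") as sample:
    resp = requests.post(
        f"{BASE_URL}/voices",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"audio": sample},
        data={"name": "my-voice"},
    )
voice_id = resp.json()["voice_id"]              # hypothetical response field

# Step 2: send arbitrary text and receive speech rendered in the cloned voice.
resp = requests.post(
    f"{BASE_URL}/synthesize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"voice_id": voice_id, "text": "Any text the user types is read aloud."},
)
with open("output.wav", "wb") as out:
    out.write(resp.content)
```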
Fryda Wolff, a voice actor who has appeared in games such as Apex Legends, told Motherboard “game developers, animation studios, and perhaps even commercial clients could get away with squeezing more performances out of me through feeding my voice to AI, using these generated performances, and then never compensating me for use of my ‘likeness’, never mind informing my agency that this was done.”
Sarah Elmaleh, a voice actor and director who has worked on Fortnite and Halo Infinite, said she believed that consent in performing “must be ongoing.”
“What happens when we happily agree to a role, and, once in the booth, we see a particular line in the script that doesn’t feel right, and express unambiguous discomfort? What happens if the producer doesn’t comprehend or accept the seriousness of that objection? Normally, we are able to refuse to read the line, to prevent it from being used. This technology obviously circumvents that entirely,” she said.
Tim Friedlander, president and founder of the National Association of Voice Actors (NAVA), told Motherboard in an email that clauses in contracts that give a producer the right to synthesize an actor’s voice are now “very prevalent.”
“The language can be confusing and ambiguous,” Friedlander said. “Many voice actors may have signed a contract without realizing language like this had been added. We are also finding clauses in contracts for non-synthetic voice jobs that give away the rights to use an actor’s voice for synthetic voice training or creation without any additional compensation or approval. Some actors are being told they cannot be hired without agreeing to these clauses.”
Cho said he hasn’t personally seen an increase in these sorts of clauses, but “I’ve heard from my peers that they’re becoming more and more common.”
In response, NAVA has published advice for actors who come across such language in their contracts, including flagging contracts to union representatives.
Friedlander said sections of the voice acting industry will be lost to synthetic voices too, "especially the blue collar, working class voice actor who works a day job 9-5 and then is trying to build a VO [voice over] career from there. Those jobs are what will be lost to synthetic voices first and will damage a large part of the industry."
On its website, ElevenLabs says it wants to “make on-demand multilingual audio support a reality across education, streaming, audiobooks, gaming, movies, and even real-time conversation,” and has tools that “provide the necessary quality for voicing news, newsletters, books and videos.”
Mati Staniszewski, co-founder of ElevenLabs, told Motherboard in an email that the company sees a future in which AI companies and voice actors partner together. “Voice actors will no longer be limited by the number of recording sessions they can attend and instead they will be able to license their voices for use in any number of projects simultaneously, securing additional revenue and royalty streams. This potential was already recognized by voice actors themselves, a few dozen of whom contacted us declaring interest in such partnerships,” Staniszewski wrote.
In response to ElevenLabs’ statement, Wolff said “actors don’t want the ability to license or ‘secure additional revenue streams,’ that nonsense jargon gives away the game that ElevenLabs have no idea how voice actors make their living.” Wolff added, “we can just ask musicians how well they’ve been doing since streaming platforms licensing killed ‘additional revenue and royalty streams’ for music artists. ElevenLabs’ verbiage is darkly funny.”
When Motherboard asked Staniszewski to introduce one of these dozens of voice actors who had contacted the company, he pointed to Lance Blair, a voice actor whose portfolio includes advertisements and conference videos. Blair said, “Despite the valid concerns of my colleagues which I share, I am embracing this technology to help me hear myself as others hear me and to explore different ways of approaching my texts.”
Blair said he was a non-union worker. For union workers, SAG-AFTRA, an actors’ union in the U.S., told Motherboard that the right to simulate a performer’s voice is a mandatory subject of bargaining. “Any language in a performer’s contract which attempts to acquire digital simulation or digital creation rights is void and unenforceable until the terms have been negotiated with the union,” SAG-AFTRA said in a statement.
Friedlander added that “NAVA is not anti-synthetic voices or anti-AI, we are pro voice actor. We want to ensure that voice actors are actively and equally involved in the evolution of our industry and don’t lose their agency or ability to be compensated fairly for their work and talent.”
“I’m completely against it. Synthesizing a voice takes the soul and spontaneity out of a real-life performance,” Cho said. He added, “I can only hope that synthetic voices just go away entirely, but at the very least, actors should be given the option to not agree to their use.”