The Rise of Ethical Voice Cloning in the Deepfake Voice Wars
Apr 12, 2022 10:00:00 AM
Deepfake voice technology has evolved dramatically over the past decade. These advancements have driven the technology’s growing popularity in multiple industries, including entertainment, film, marketing, healthcare, and customer service.
Such rapid growth and demand are always accompanied by active discussions about the ethical use of new technologies. The notoriety of deepfake videos has not helped the debate. Nevertheless, today, voice cloning software is considered safe and ethical. This article will explain how and why.
The rise of voice cloning technology
The technology got its start not long ago with simple speech synthesizers: programs capable of converting text to human speech. Even today, this is one of the most widespread uses of the technology. For example, Google Translate can read a text aloud in a foreign language after translating it.
Text-to-speech voice conversion reached its peak in products like Descript's Overdub — ultra-realistic text-to-speech voice cloning widely used in podcasting and radio. Services like Overdub help create pieces of audio content so that producers never have to reach out to voice actors.
After realistic voice generators, AI deepfake voice technology made its way onto the market. Using machine learning, Respeecher created a technology capable of converting one person's voice into the voice of someone else. We’ve examined in detail how this technology is changing content production for the better in a series of articles on our blog:
- How Voice Cloning Allows for Multiple Language Conversion using AI
- What Is Voice Marketing and Why You Need to Use It
- 3 Ways Voice Synthesis Software Helps YouTubers Scale Content Creation
- Debunking the 4 Most Common Voice Synthesis Myths
In short, you can convert any person's voice, regardless of gender, into a target voice. There is only one requirement: the algorithm needs an hour-long, high-quality recording of the target's voice so that the AI can build its model correctly. Once the model has been generated, you can convert unlimited speech into the target voice without sacrificing the source voice’s intonation, cadence, vocal emphasis, and so on.
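The recording requirement above can be checked before any audio is submitted. The following is a minimal sketch using Python's standard `wave` module, not Respeecher's own tooling. The function names and the 44.1 kHz threshold are assumptions made for illustration; the only figure taken from this article is the one-hour duration.

```python
import wave

# Only the one-hour figure comes from the article. The sample-rate
# threshold is a hypothetical stand-in for "high-quality".
MIN_SECONDS = 60 * 60
MIN_SAMPLE_RATE_HZ = 44_100


def recording_summary(path):
    """Return (duration_seconds, sample_rate_hz) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate
    return duration, sample_rate


def meets_requirements(path, min_seconds=MIN_SECONDS,
                       min_sample_rate=MIN_SAMPLE_RATE_HZ):
    """Check a recording against the assumed length and quality thresholds."""
    duration, sample_rate = recording_summary(path)
    return duration >= min_seconds and sample_rate >= min_sample_rate
```

Calling `meets_requirements("target_voice.wav")` on a hypothetical file returns `True` only if the recording is at least an hour long at the assumed sample rate. Duration and sample rate are only rough proxies: background noise, clipping, and room echo also affect quality and are not measured here.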
In case the original audio recordings are not of the best quality, especially if they are old, Respeecher built an audio version of the super-resolution algorithm to deliver the highest resolution audio across the board. Want to find out more? Download this whitepaper on audio super-resolution with Respeecher.
Ethical doubts around the voice cloning process
As you can see, there is nothing unethical about voice cloning technology itself. And although it uses the same AI technology as video deepfakes, there are significantly fewer examples of defamatory deepfake voices.
However, it is becoming more common for deepfakes to combine audio and video with the goal of deceiving as many people as possible. Here are the most famous examples.
Every human being’s voice is unique. This is why some government and financial institutions use voice authentication to protect access to private assets. In everyday life, most people also rely on their natural ability to distinguish the voices of friends and family when they cannot see them.
All this creates ideal circumstances for those with bad intentions to gain access to people's personal information or money.
Lawmakers in many countries are busy establishing proper regulations for producing and using artificially synthesized voices. In the United States, a bill called the Defending Each and Every Person from False Appearances by Keeping Exploitation Subject to Accountability Act (the DEEP FAKES Accountability Act) was introduced in Congress in 2019, although it has not been enacted.
In 2020, fake news was estimated to have cost the global economy up to $78 billion. In 2019, cybersecurity company Deeptrace reported that the number of deepfake videos circulating online had surpassed 15,000, a number that was expected to keep doubling each year.
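To make the doubling claim concrete, here is a small arithmetic sketch. It assumes the count starts at the 15,000 figure cited above and doubles exactly once per year. The function name is hypothetical, and the output is an illustration of that assumption, not a forecast.

```python
def projected_count(base_count, years, growth_factor=2):
    """Project a count that multiplies by growth_factor once per year."""
    return base_count * growth_factor ** years


# Starting from the 2019 figure cited in the article.
for years_after_2019 in range(4):
    print(2019 + years_after_2019, projected_count(15_000, years_after_2019))
```

Under that assumption, the count would pass 100,000 within three years of the 2019 figure.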
Deepfakes are widely used in the political arena — to mislead voters and manipulate facts. All this can create financial risks and damage the very fabric of our society.
Controversial media applications
Aside from cases of malicious intent, some deepfake applications in media fall short of ethical standards.
One such example would be the 2021 Anthony Bourdain deepfake controversy.
A film detailing the life of Anthony Bourdain encountered backlash after the director disclosed that the producers had used deepfake voice technology. Some of his quotes were narrated using a cloned voice because the filmmakers did not have access to original audio recordings.
Naturally, this raised concerns in the community. Because the technology can be used to alter the historical record, there is a pressing need to ensure that voice cloning is done ethically. In this regard, the AI engineering community is constantly working to improve the recognition of audio and video deepfakes.
Be that as it may, there are many more positive examples of utilizing deepfake voice technology than negative ones. Here are just a few.
Recent examples of ethical AI voice cloning
Here at Respeecher, we take ethics very seriously. That's why we are committed to following a strict ethical code for voice cloning.
Here are just a few projects from our portfolio. As you will see, every single one was created in close cooperation with the copyright holders and families of those deceased (in case concerns arise over a project’s use of a voice).
We recommend taking a quick look at these stories:
- Respeecher synthesized a younger Luke Skywalker's voice for Disney+'s The Mandalorian
- Respeecher Gives Voice to Michael York in Healthcare Initiative
- Manuel Rivera Morales' Voice Re-created by AI for the Olympic Games
- Revealed: How Respeecher Took Part in Creating a Digital Vince Lombardi for Super Bowl LV
The titles speak for themselves and include resurrection projects and voice cloning for actual living celebrities and movie stars.
As you can see, there's no inherent evil in a deepfake voice app in and of itself. However, there are those who intentionally disregard responsibility or use the AI with malicious intent.
The future of voice conversion as Respeecher sees it
With developments like the recent Respeecher and Veritone partnership or voice cloning making its way to Hollywood, it's evident that voice cloning is here to stay. As pioneers of the technology, we want to ensure that voice cloning is applied ethically.
In addition to purely technical measures, which include the development of algorithms for deepfake identification and voice watermarking, we are working to democratize and educate the market.
Making the technology understandable and accessible to as many businesses and creative projects as possible will help protect the community from scammers and unethical use.
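For readers curious about what voice watermarking can look like in principle, the sketch below shows a toy, keyed watermark: a faint pseudo-random sequence is added to the audio and later detected by correlation. This is a textbook-style illustration, not Respeecher's method. The function names, the key, and the strength value are all assumptions chosen for the example.

```python
import math
import random


def keyed_sequence(key, length):
    """Pseudo-random sequence of -1.0/+1.0 values derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_watermark(samples, key, strength=0.05):
    """Add a faint keyed sequence to the audio samples."""
    marks = keyed_sequence(key, len(samples))
    return [s + strength * m for s, m in zip(samples, marks)]


def detect_watermark(samples, key, strength=0.05):
    """Correlate the audio with the keyed sequence and compare to a threshold."""
    marks = keyed_sequence(key, len(samples))
    correlation = sum(s * m for s, m in zip(samples, marks)) / len(samples)
    return correlation > strength / 2


# One second of a 220 Hz tone sampled at 48 kHz stands in for speech.
tone = [0.5 * math.sin(2 * math.pi * 220 * n / 48_000) for n in range(48_000)]
marked = embed_watermark(tone, key="demo-key")

print(detect_watermark(marked, key="demo-key"))  # watermarked audio, right key
print(detect_watermark(tone, key="demo-key"))    # original audio, no watermark
```

The detector needs only the key, not the original audio. A production watermark would also have to be inaudible and survive compression, resampling, and editing, none of which this toy version attempts.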
Contact us if you're looking for a trustworthy partner for your media, marketing, or healthcare initiative. We are always eager to help.