Code[ish] Podcast: The Ethical and Technical Side of Deep Fakes featuring Respeecher

Jan 28, 2021 8:02:06 AM

We were recently invited to talk about deep fakes on Code[ish], a podcast from Heroku's developer advocate team at Salesforce that explores code, technology, tools, tips, and the life of the developer.

During two episodes hosted by Julián Duque, our CEO Alex Serdiuk and CTO Dmytro Bielievtsov talk about ethical deep fakes and the technical aspects of creating them.

Listen to The Ethical Side of Deep Fakes and The Technical Side of Deep Fakes to broaden your knowledge about synthetic media, specifically voice synthesis.

The synthetic media industry is built on AI-generated media, spanning text, music, video, image, and AI voice generation. For example, CGI and Photoshop generate synthetic media, because they help others create modified content. At this point, synthesized video is much more advanced than synthesized audio.

The ethical side of deep fakes

Alex explains where Respeecher fits into all of this. We aim to revolutionize the way content is produced by bringing more flexibility to industries like entertainment, video games, advertising, and more through our speech-to-speech voice conversion technology.

Voice conversion use cases

  • It’s hard to schedule top actors for voiceover or dubbing work. Voice cloning (or conversion) allows you to scale any voice and gives you the flexibility to record new lines anytime.

  • Resurrect voices from the past: Bring back the voice of an actor who has passed away. Maybe you want to add a historical voice to a project.

  • Record any voice in any language: Ready to capture an overseas audience? Language-agnostic speech-to-speech technology empowers you to record in any language.

  • Add dialogue anytime: Decided to add a few lines after filming? Just turn on your microphone and start speaking - without calling an actor back into the studio.

  • Replicate children’s voices: Kids say the darndest things - but they’re challenging to work with. Read more on this topic in our article about voice conversion for children’s voices.

The technical side of deep fakes

Dmytro explains how synthetic audio is produced and why it’s hard to fake. In general terms, a few speech machine learning (ML) models are already available on the internet, but the best way to clone a voice with good quality is to use an audiobook as a sample of the original speaker and combine it with these pre-existing models.

The problem here is that the outputs produced by these models are poor in quality, which is one reason why speech-to-speech technology is hard to fake. Human linguistic variations and patterns, together with the emotional component of speech, make the process of voice cloning difficult.

In fact, unlike text-to-speech (TTS) technologies that may produce dull content, speech-to-speech (STS) software generates more natural content by preserving the intonation and emotion of the original speaker.

Main concerns about the usage of unethical deep fakes

The main concern about deep fakes is the risk that they will be used unethically, to pretend that someone said or did something that never happened. But it’s also true that all technologies have potentially malicious uses in the hands of the wrong people.

Dmytro offers more details about what deep fakes are from a technical point of view: a technology that uses deep learning and deep neural networks to fake a piece of video or audio and replace it with other video or audio content.

The term “deep fake” is now used for everything that involves synthesizing human appearance using a neural network.

The biggest potential danger of deep fakes is not their existence, but people’s inability to detect them.

Alex Serdiuk, Respeecher CEO

Respeecher works with leading Hollywood movie studios, game developers, and major multinational corporations and has strict ethical principles.

We do not use voices without permission when this could impact the privacy of the subject or their ability to make a living. In practice, this means we will never use the voice of a private person or an actor without permission.

This policy helps ensure that the content produced by Respeecher can't be faked or used in abusive ways. As a further safeguard, a watermark is applied to each piece of audio: certain "artifacts" are embedded into the audio that are imperceptible to humans but easily identifiable by a computer program.
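
To make the watermarking idea more concrete, here is a minimal sketch of one textbook approach, a spread-spectrum watermark. It is an illustration only and not Respeecher's actual method, which is not described in this post. The function names, the secret key, and the strength value are all hypothetical, and the strength is chosen so that the toy example detects reliably rather than to guarantee inaudibility.

```python
import math
import random

SAMPLE_RATE = 16_000  # samples per second (assumed for this toy example)


def _chips(key, n):
    """Pseudo-random +1/-1 sequence that only the key holder can regenerate."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed_watermark(samples, key, strength=0.01):
    """Add a low-amplitude key-dependent noise pattern to the audio samples."""
    chips = _chips(key, len(samples))
    return [s + strength * c for s, c in zip(samples, chips)]


def watermark_score(samples, key):
    """Average correlation between the audio and the key's noise pattern."""
    chips = _chips(key, len(samples))
    return sum(s * c for s, c in zip(samples, chips)) / len(samples)


def is_watermarked(samples, key, strength=0.01):
    """Decide by comparing the score to half the embedding strength."""
    return watermark_score(samples, key) > strength / 2


# Hypothetical demo signal: 10 seconds of a 220 Hz tone standing in for speech.
SECRET_KEY = 1234
host = [
    0.5 * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE)
    for t in range(10 * SAMPLE_RATE)
]
marked = embed_watermark(host, SECRET_KEY)
```

With these toy settings, a detector holding the right key should flag `marked` but not `host`, and a detector with a different key should find nothing. Production systems typically shape the watermark so that it stays inaudible and survives compression, which this sketch does not attempt.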

Conclusion

Our mission is to make sure that voice cloning technology is used in beneficial ways, according to ethical principles. Our goals are very clear and we intend to:

  • Educate the public about the capabilities of synthetic speech technology;

  • Develop automatic detection algorithms that can detect AI voices even if they have not been watermarked by us;

  • Work with gatekeepers of content such as Facebook and YouTube to limit the harm of voice cloning. 

Follow our journey on Facebook, Twitter, LinkedIn, and YouTube, and reach out if you’d like to find out more about our AI voice generator software and how you can use it for your content creation project.

Orysia Khimiak

PR and Comms Manager

For the past 9 years, I have been engaged in global PR for early-stage and AI startups, in particular Reface, Allset, and now Respeecher. My clients have been featured in WSJ, Forbes, Mashable, The Verge, TechCrunch, and Financial Times. For over a year, I have been teaching the PR Basics course at Projector. During the war, I became more actively involved as a fixer and worked with the BBC, The Guardian, and The Times.