by Anna Bulakh – Feb 9, 2024 10:42:13 AM • 8 min

Respeecher Implements Content Credentials for Voice Marketplace to Combat Synthetic Speech Misuse

In recent years, the rapid development of advanced technologies has put celebrities and their fans at greater risk of being targeted by attackers. Fraudsters use fake videos, called deepfakes, to collect the personal information of large numbers of people, as in the case of a deepfake in which Taylor Swift appeared to advertise cookware. Attackers have also tried to damage the image of Ukrainian President Volodymyr Zelenskyy by releasing a series of heavily edited videos.

As synthetic speech technology develops, such cases may become more common. We need to confront this danger now and work out how to counter it. Governments have begun taking action to safeguard the public; for instance, the US Federal Trade Commission issued a consumer warning regarding scams facilitated by voice cloning and announced a prize for solutions to counter this threat. The Biden administration and the European Union have advocated clearly labeling AI-generated content.

This fight can only succeed with the participation of companies that operate in this market. Respeecher understands the current threat well and has adopted Content Credentials through the Content Authenticity Initiative's (CAI) open-source tools. Content Credentials, based on the C2PA open standard, authenticate the source and history of digital content.

What Are Content Credentials For?

Respeecher enables individuals to perform using the voice of another. Initially, the company focused on the film and television sectors, contributing to significant projects like Lucasfilm's "The Mandalorian" series, for which it synthesized Mark Hamill's iconic voice for Luke Skywalker. Now, with the launch of the Voice Marketplace, Respeecher aims to make its Hollywood-grade technology broadly accessible. Broader access also raises obvious concerns that fraudsters could misuse the technologies developed by the company.

The Content Authenticity Initiative (CAI) spearheads efforts to promote transparency and trustworthiness in digital media. Respeecher has partnered with CAI to uphold digital content's integrity and ethical utilization. In this partnership, Respeecher announced the incorporation of Content Credentials, highlighting their significance in establishing trust within the digital landscape.

Ethics has always been crucial for the Respeecher team. The company doesn't permit user-defined voices, and it holds that users should be informed when the content they're consuming features synthetic speech. As a pioneer in this field, Respeecher prioritizes equipping all content from the Voice Marketplace with Content Credentials. Thanks to this, consumers will be able to verify the origin of audio in videos with little effort, and content lacking cryptographic Content Credentials will naturally raise suspicions regarding its authenticity. Content creators, in turn, can establish ownership of their artistic work by embedding credentials in their audio productions.

How Does It Work?

Respeecher seamlessly integrated CAI's open-source C2PA tool into the Voice Marketplace. Whenever synthetic audio is generated on its servers, it undergoes automatic cryptographic signing, affirming its origin from the marketplace. Upon downloading the audio, clients receive metadata containing Content Credentials, certifying its conversion into a different voice by Respeecher. GlobalSign, a trusted third-party certificate authority, signs these credentials with its cryptographic key, enabling recipients to easily verify the authenticity of the content and distinguish it from impersonated versions.

The metadata is an intrinsic part of the file rather than a visible watermark. If the metadata is tampered with or removed, consumers are alerted that the file's source is uncertain. Any alteration made to the file after download, such as a change to the target voice's name, results in a mismatch between the cryptographic signature and the file's contents. Manipulating the metadata without invalidating Respeecher's signature is therefore virtually impossible.
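The signing and tamper-detection behavior described above can be illustrated with a minimal sketch. It is not Respeecher's implementation and does not use the C2PA tooling: real Content Credentials are signed with a private key backed by a certificate from a certificate authority, whereas this example substitutes an HMAC with a shared secret so that it runs with the Python standard library alone. The names sign_audio, verify_audio, SIGNING_KEY, and the manifest fields are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical secret used only for this demo. Real Content Credentials rely on
# a private key and a CA-issued certificate, not on a shared secret.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(claim: dict) -> bytes:
    """Serialize the claim deterministically so signer and verifier agree."""
    return json.dumps(claim, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _sign(claim: dict) -> str:
    """HMAC-SHA256 over the claim (stand-in for a certificate-based signature)."""
    return hmac.new(SIGNING_KEY, _canonical(claim), hashlib.sha256).hexdigest()


def sign_audio(audio: bytes, target_voice: str) -> dict:
    """Build a manifest whose claim is bound to the audio bytes by a hash."""
    claim = {
        "generator": "Voice Marketplace (illustrative)",
        "target_voice": target_voice,
        "audio_sha256": hashlib.sha256(audio).hexdigest(),
    }
    return {"claim": claim, "signature": _sign(claim)}


def verify_audio(audio: bytes, manifest: Optional[dict]) -> str:
    """Return 'valid', 'missing', or 'tampered' for a downloaded file."""
    if manifest is None:
        return "missing"  # credentials were stripped or never attached
    claim = manifest.get("claim")
    signature = manifest.get("signature")
    if not isinstance(claim, dict) or not isinstance(signature, str):
        return "tampered"
    # Any change to the metadata changes the expected signature.
    if not hmac.compare_digest(_sign(claim), signature):
        return "tampered"
    # Any change to the audio changes its hash, which no longer matches the claim.
    if claim.get("audio_sha256") != hashlib.sha256(audio).hexdigest():
        return "tampered"
    return "valid"


audio = b"\x00\x01placeholder-audio-bytes"
manifest = sign_audio(audio, "Voice A")
renamed = {
    "claim": dict(manifest["claim"], target_voice="Voice B"),
    "signature": manifest["signature"],
}

print(verify_audio(audio, manifest))         # valid
print(verify_audio(audio, renamed))          # tampered (metadata edited)
print(verify_audio(audio + b"x", manifest))  # tampered (audio edited)
print(verify_audio(audio, None))             # missing
```

Because the signature covers the claim and the claim contains a hash of the audio, editing either the target voice's name or the audio itself breaks verification. With a certificate-based signature, unlike the shared secret assumed here, anyone can verify the file without being able to produce a valid signature themselves.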

What Does the Future Hold?

Respeecher plans to give Voice Marketplace users the option to add their own authorship credentials to the Content Credentials. At present, the metadata denotes that Respeecher modified the audio but doesn't attribute the audio to its creator. Some users may opt for anonymity; others will choose to include their credentials.

Respeecher remains committed to spearheading initiatives to adopt data provenance standards across the AI synthetic media industry. Companies need to establish a unified authentication layer for audio and video content, with content distribution platforms such as Meta and news websites playing pivotal roles in embracing this technology. Similar to how the absence of HTTPS alerts users to potential security risks, a comparable mechanism could inform users about the source of an audio file, thereby enhancing transparency and authenticity.

Additionally, Respeecher closely monitors the evolution of cloud-hosted metadata. The challenge of embedding inconspicuous watermarks in audio and video signals without compromising their integrity remains largely unresolved. The company aims to simplify authentication and mitigate the synthetic media detection problem by storing metadata in the cloud and facilitating checks for pre-signed content.

Anna Bulakh
Head of Ethics and Partnerships
Blending a decade of expertise in international security with a passion for the ethical deployment of AI, I stand at the forefront of shaping how emerging technologies intersect with national resilience and security strategies. As the Head of Ethics and Partnerships at Respeecher, I focus on guiding ethical AI development. My role is centered around promoting the responsible use of AI, especially in synthetic media.