Applying Machine Learning Technologies to Dubbing and Localization
Apr 14, 2021 6:45:18 AM
When watching a foreign film or playing a localized video game, the last thing we tend to notice is dubbing. For the viewer, dubbing has become something ordinary. The same cannot be said for producers.
Dubbing and localization often remain a headache and a significant drain on revenue. Let's take a look at how technology has changed this space over the past 5 years.
Localization, voiceover, and dubbing in a nutshell
First, let's quickly go over the basic concepts.
Localization is the process of adapting content to a particular locale. It is often confused with conventional translation, but the two are not the same. Localization can also involve adjusting a design so that translated text displays correctly in the local language, or adapting graphics to suit the locale's expectations.
With voiceover, the dialogue in the source language is translated into the target language. A voice cast then speaks the localized dialogue, which is added over the original audio. This technique was often used to localize films in the nineties and earlier.
Dubbing is the process of adding new dialogue to an original video soundtrack after the video has already been filmed. Dubbing is not necessarily related to content localization. Producers often require dubbing even within the same language. The most common example is automated dialogue replacement (ADR).
In the case of voiceover, the original soundtrack is overlaid with the localized dialogue. You can still hear the source speech through the louder voiceover. This technique makes sense if you want the viewer to listen to the original dialogue.
Hearing the original speech can sometimes help immerse the viewer in the scene. In dubbing, by contrast, the listener does not hear the original voice at all and perceives the dubbing actor's speech as the on-screen actor's own voice.
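As a rough illustration of the voiceover overlay described above, here is a minimal Python sketch. It assumes audio is represented as plain lists of float samples in the range -1.0 to 1.0. The function name `mix_voiceover` and the `duck_gain` parameter are hypothetical and chosen only for this example. Real productions use dedicated audio tools rather than code like this.

```python
def mix_voiceover(original, voiceover, duck_gain=0.3):
    """Overlay a voiceover on top of an attenuated original track.

    original, voiceover: lists of float samples in [-1.0, 1.0] (assumed format).
    duck_gain: hypothetical factor that lowers the original so it stays
    audible underneath the louder voiceover.
    """
    length = max(len(original), len(voiceover))
    mixed = []
    for i in range(length):
        # Treat a track that has already ended as silence.
        o = original[i] if i < len(original) else 0.0
        v = voiceover[i] if i < len(voiceover) else 0.0
        sample = duck_gain * o + v
        # Clamp to the valid sample range to avoid overflow.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```

With `duck_gain` above zero, the source speech remains faintly audible, as in a voiceover. Setting it to `0.0` on a dialogue-only track approximates the dubbing case, where the original voice is not heard at all.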
Dubbing adds further complexity for producers because, to do it correctly, they have to precisely match the dubbed voice to the movements of the actors' mouths.
The dub must be correctly timed and lip-synced, and it must convey the speakers' meaning and intonation. This process takes much longer than voiceover.
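The timing requirement can be made concrete with a small Python sketch. It assumes each line of dialogue has a known start and end time in the original footage and a measured duration for the recorded dub. The `DialogueLine` class, the `out_of_sync` function, and the 0.15-second tolerance are all hypothetical, picked for illustration only. The check covers overall duration and does not attempt actual lip-sync analysis.

```python
from dataclasses import dataclass


@dataclass
class DialogueLine:
    start: float         # seconds: where the original line begins
    end: float           # seconds: where the original line ends
    dub_duration: float  # seconds: length of the recorded dub


def out_of_sync(lines, tolerance=0.15):
    """Return indices of lines whose dub does not fit the original slot.

    tolerance is a hypothetical allowance in seconds, not an industry value.
    """
    flagged = []
    for index, line in enumerate(lines):
        slot = line.end - line.start
        drift = line.dub_duration - slot
        if abs(drift) > tolerance:
            flagged.append(index)
    return flagged
```

Lines flagged this way would need a retake or a rewritten translation, which is part of why dubbing takes longer than voiceover.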
In the video industry, localization has traditionally meant adding subtitles. Subtitling is perhaps the fastest way to adapt footage to foreign markets.
However, it is not as convenient for the viewer. Although subtitles have been used to localize Hollywood films since the 1930s, they have become less and less popular for larger-scale projects due to the advent of dubbing technology.
Goals and use cases for localization and dubbing
Today, all three approaches are actively used to achieve two specific goals:
Significantly increase the audience of your content or product through foreign markets.
Adapt original film and game art for markets with a different culture. A textbook example is the adaptation of Japanese anime for the American market. American dubs often dropped references to Japanese culture and history because they would mean little to American audiences.
The list of localization and dubbing applications has gone well beyond cinema alone. This was first facilitated by international trade, business development, and the IT sector.
The localization of goods going to the global market was just the beginning. Today, dubbing is widely used in computer games, television programs, software, and even the client services of multinational companies.
How modern technologies are changing the voiceover and dubbing market
We recently published a blog post on how deepfake technology impacts digital marketing and advertising. We recommend reading it if you're interested in learning more about the broader applications of deep learning and AI systems in modern business.
The advances in dubbing are owed to largely the same technologies. But before we look at what has changed, let's review the traditional challenges that dubbing producers have had to confront.
Before the process of dubbing can begin, a precise character portrait needs to be created. This helps the producer find a voice actor with the right personality for dubbing.
Then, you have to actually record the dub. This requires a voice actor to be present in a studio. As with ADR, it's hard work because the actor must hit the timings and reproduce the original actor's emotions.
The sound engineer then applies the effects of the original recording environment to the voice actor's recording. When this is done properly, the dub matches the conditions in which the original speech was delivered.
The last stage is editing and combining the new sound with the original video track.
Very often, when dubbing has concluded, edits need to be made to the recorded audio track. The result is a somewhat troublesome process that takes many hours and demands a significant financial investment.
Things are entirely different if you have access to voice cloning technology. This technology makes dubbing easier, and it opens up possibilities that the traditional approach cannot offer. Let's start by listing the key advantages:
Dubbing into foreign languages can now be done with the voice of the same actor who played the original role. Imagine Brad Pitt speaking Japanese or German with his authentic voice. With the use of artificial intelligence technologies, this is now possible.
Because artificial intelligence transforms the original speech, you can almost completely eliminate the problem of voice timing.
The issue of conveying moods and emotions is removed. Today's technologies can not only synthesize an actor's speech in another language but also reproduce the emotions expressed in the original recording.
You no longer need to hire costly professional voice actors. Every stage of production can now be managed and generated by a sound engineer without having to re-record in the studio.
Due to the growing interest in synthetic media, more and more businesses are changing their approach to building and managing brands. Localization and dubbing are becoming an integral part of YouTube strategies and of entry strategies for international ad markets.
Today, as part of your brand strategy, you can literally choose the voice of your advertising or brand campaign as well as its digital face. More and more companies, including Samsung and Kia, are switching to digital humans for ads along with their digital support services.
As a voice synthesis service provider, Respeecher helps businesses design and create unique brand voices and localize video and audio content. In 2021, we launched the synthetic Voice Marketplace, a place where any business or creative entity can pick out a voice for their brand or ad campaign.
All with zero copyright hassles and the easiest licensing process on the market. The company simply purchases a synthesized voice that can then be used forever. Since this is a synthetic voice, you don't need to sign exclusive agreements with certain actors or draft expensive contracts.
Any member of your team, or a voice actor, can serve as the source of speech, which will then be transformed into the voice that you acquired, all while preserving the intonations and emotions you have envisioned.
The final step is to localize your videos or podcasts into any language in the world in a matter of days or weeks, not months or years. Interested? Contact us for more information and a demo. We look forward to hearing from you.