How hard would it be to make an AI to dub anime?

Betonhaus

We have an AI that can replicate a voice just by hearing a few lines
We have an AI that can recognize spoken words
We have an AI that can translate text to a different language while preserving key context

How hard would it be to make an AI that can make a dub in the exact same voice as the original voice actor, and preserving the intonations and emotions conveyed in each word?
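
For illustration, here is a rough sketch of how those three pieces might chain together once speech recognition has produced timed lines. Everything in it is an assumption: `Line`, `dub_lines`, `translate` and `synthesize` are hypothetical names, and the real recognition, translation and voice-cloning models would be plugged in as the callables. It only shows the plumbing, not a working dubbing system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Line:
    """One recognized line of dialogue (hypothetical structure)."""
    speaker: str   # which character is talking
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str      # recognized text in the original language


def dub_lines(
    lines: List[Line],
    translate: Callable[[str], str],
    synthesize: Callable[[str, str, float], bytes],
) -> List[Tuple[float, str, str, bytes]]:
    """Translate each line, then synthesize it in that speaker's cloned voice.

    `translate` and `synthesize` are stand-ins for whatever models are used.
    """
    dubbed = []
    for line in lines:
        translated = translate(line.text)
        # Pass the original duration so the synthesizer can try to match timing.
        audio = synthesize(line.speaker, translated, line.end - line.start)
        dubbed.append((line.start, line.speaker, translated, audio))
    return dubbed
```

The part this sketch does not cover is the hard one: carrying the original intonation and emotion through `synthesize`.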

And while we're at it, can we use an AI to edit the video to lip sync it with the audio? And maybe develop a video format that stores the alternative lip sync and plays the right one for the language chosen?
 
That's the hard part, the inflection we take for granted is a big part of communication, and AI can't believably emulate it. It's a lot harder than text. We're definitely not there yet.
I'm not sure. If you use a single phrase to generate a voice then that voice has the tone and inflections of that single phrase. If you generate a new voice for each phrase then each individual phrase has a different set of tones and inflections.
It should be possible to generate an overall voice that gets dynamically tuned the entire time the character talks.
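
A toy sketch of that idea, assuming the cloning model exposes some per-clip voice embedding as a list of numbers (that is an assumption, and `mean_embedding`, `blend` and the `weight` value are made up for illustration): average the embeddings of all of a character's lines into one base voice, then nudge it toward the embedding of whichever line is being dubbed.

```python
from typing import List


def mean_embedding(embeddings: List[List[float]]) -> List[float]:
    """Average per-line embeddings into one base voice for the character."""
    n = len(embeddings)
    return [sum(dim) / n for dim in zip(*embeddings)]


def blend(base: List[float], line: List[float], weight: float = 0.3) -> List[float]:
    """Shift the base voice toward the current line's tone.

    weight=0 keeps the base voice unchanged; weight=1 uses only the current line.
    """
    return [(1 - weight) * b + weight * l for b, l in zip(base, line)]


# Two lines from the same character, as made-up 2-dimensional embeddings.
base = mean_embedding([[0.0, 2.0], [2.0, 4.0]])
tuned = blend(base, [3.0, 1.0], weight=0.5)
```

Whether a simple blend like this would actually carry over inflection is an open question; it only shows what "one voice, tuned per line" could mean in practice.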
 
How hard would it be to make an AI that can make a dub in the exact same voice as the original voice actor, and preserving the intonations and emotions conveyed in each word?
It probably exists already, and the sticking point is the Japanese voice actors' voice rights.
 
You'd probably need an editor for translations. Lip syncing will be tricky but manageable. The whole thing will probably be slightly less hard than the usual subtitling ordeal with the added benefit that probably 90% of it will be automatic and good enough.
 
That's the hard part, the inflection we take for granted is a big part of communication, and AI can't believably emulate it. It's a lot harder than text. We're definitely not there yet.
If you have the English dub of the series (and it's considered good in its own right), you can just use the voice models of the Japanese characters and run them through said dub via RVC. We are actually there right now, and have been since July last year; it just still needs the human element to make tweaks.

That's assuming you just want to listen to the anime with the Japanese voice actors talking in English. If you want to translate a Japanese-only anime from scratch and dub it, yeah, tough luck. You'd be better off hiring voice actors (male and female) to cast the roles, then using their performances as the basis for RVC.
 
I'm still a sub purist. Just give me something I can run in the background that adds English subs on the fly (and while I'm asking for magic, something I can run in the background while playing videogames that translates Japanese text as an overlay on the fly).
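
The subtitle half of that is mostly bookkeeping once something else has produced timed, translated text. A minimal sketch, assuming the recognition and translation steps already yield `(start, end, text)` tuples in seconds (the function names here are made up), that writes them out in the SubRip (.srt) layout:

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Turn (start, end, text) segments into numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Doing it truly on the fly would mean feeding short audio chunks through recognition and translation fast enough to keep up with playback, which this sketch does not attempt.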
 
It seems like you would still need to do a bit of editing and tweaking so that the AI can emote and sound less monotone.
 
It seems like you would still need to do a bit of editing and tweaking so that the AI can emote and sound less monotone.
Maybe, but if the AI can handle 90% of the work with a bit of guidance, could you open your own translation studio to do dubbing?
 
I mean I wouldn't mind, I like listening to anime when I'm doing stuff, the question is can it preserve the soul of really good dubs? I can still tell a chatbot from a person. And while you can replace Janitor #56943, replacing the MC who has the majority of the lines... I don't know.
 
Looks like someone has attempted to redub a scene in Miss Kobayashi's Dragon Maid. To me, it sounds passable.

Localized Social Justice Dub:

AI Re-Dub:
 
While not anime, a Dutch to English AI dub has been made of the Dutch Webtoon Ongezellig back when RVC was in vogue.
It's decent overall, but as the technology progresses and more people become aware that yes, you can redub your favorite anime scenes with voice cloning, we will probably see an uptick in people coming together to redub anime and foreign cartoons. It might even be a lucrative business opportunity on the horizon (provided companies don't come knocking on their door with a DMCA complaint). It'll be interesting to see how people use this tech for dubbing in the future.
 
Not really difficult at all and all of what OP said has already been done.

Search "AI Dubbing" or "AI ADR" on Google.

There's a shitton of services that already offer these exact models. I've just been too lazy to throw an anime episode at one to see how well it performs.
 