Thoughts on the future of voice-cloning tools?
I played around with this for a couple of hours. It's fun. It still doesn't sound quite right, but it's certainly better than what was possible only a few years ago.
The general idea is to pretrain a model on a large set of voice clips, so it learns the characteristics common to speech in general. After that, all you have to do is supply 5 seconds of audio and it attempts to replicate that voice through text-to-speech.
There are a lot of issues, however. The provided pretrained model is based on audiobook narrations, so the output doesn't sound like natural speech. Sometimes it sounds accurate, depending on the voice. Other times you get long pauses filled with white noise and other artifacts. It's really finicky. I figure people with naturally monotone voices would be easier to clone.
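For anyone wondering how 5 seconds can be enough: going by the linked paper, the short clip is boiled down to a fixed-size "speaker embedding", and the text-to-speech part is conditioned on that vector. Below is a rough toy sketch of just the embedding idea. It is not the project's code. The two hand-made features, the frame size, and every function name here are assumptions for illustration (the real encoder is a trained neural network), and pure tones stand in for voices.

```python
import math

FRAME = 400  # samples per frame (25 ms at 16 kHz); assumed value for this toy


def frame_features(frame):
    """Two crude per-frame features: RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]


def toy_speaker_embedding(wav):
    """Average the per-frame features over the whole clip, then L2-normalise.

    The output size is fixed (2 values) no matter how long the clip is,
    which is the property a real speaker encoder also has.
    """
    frames = [wav[i:i + FRAME] for i in range(0, len(wav) - FRAME + 1, FRAME)]
    if not frames:
        raise ValueError("clip is shorter than one frame")
    feats = [frame_features(f) for f in frames]
    mean = [sum(col) / len(feats) for col in zip(*feats)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]


def cosine_similarity(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def tone(freq_hz, seconds, sr=16000):
    """Hypothetical stand-in for a recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * n / sr) for n in range(int(seconds * sr))]


# Same "voice" at two clip lengths, plus a different "voice".
ref_5s = toy_speaker_embedding(tone(120, 5))
ref_2s = toy_speaker_embedding(tone(120, 2))
other = toy_speaker_embedding(tone(1000, 5))

print("same voice, different length:", cosine_similarity(ref_5s, ref_2s))
print("different voice:", cosine_similarity(ref_5s, other))
```

The point is only that clips of any length collapse to the same small vector, and clips from the same source land closer together than clips from different sources. How well that vector captures a real voice is where the "finicky" part comes in.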
There are a lot of implications worth discussing. I know the internet has already LARPed about the morality of deepfakes or whatever. I like the idea of using it to reduce the cost of voice acting, but right now we're sitting in the uncanny valley of voice synthesis.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time


Real-Time Voice Cloning Toolbox
Project here: https://github.com/CorentinJ/Real-Time-Voice-Cloning
Original paper: https://arxiv.org/abs/1806.04558