Microsoft’s New AI Can Imitate Your Voice With Just A 3-Second Sample
The AI are coming and they don’t stop coming. From AI art generators that can make Dungeons & Dragons characters to chat bots that can DM an entire D&D game, AI is becoming increasingly powerful. And now, not only can it mimic the art styles of various artists, it can mimic our voices too.
We’ve already seen AI voice tech being used in video games, but Microsoft’s Vall-E promises to be even easier to use. Dubbed a "neural codec language model", Vall-E (an homage to OpenAI’s Dall-E art generator) has been trained on over 60,000 hours of speech, making it "hundreds of times larger than existing systems."
You can see a demo of Vall-E on Microsoft’s GitHub page here (thanks, Rock Paper Shotgun). The system can recreate a specific voice with just three seconds of dialog, allowing the user to simply type what they want that voice to say to create paragraphs upon paragraphs of spoken audio.
While this sort of tech, along with impersonation tech like deepfakes, represents an enormous threat in the fight against misinformation online, voice actors are also rightly concerned that it could put them out of a job.
Altered AI, a company focused on using AI to create realistic vocal performances, was reportedly used in the creation of The Ascent and Hellblade, according to GLHF. Ninja Theory responded to the report, clarifying that it uses AI for placeholder vocals until a human performance can be scheduled. Neon Giant, the makers of The Ascent, noted that AI vocals have been a huge boon for the former indie developer.
Before every voice actor starts phoning their union rep, we should note that Vall-E isn’t perfect. As you can hear in the samples, it seems to have a bit of trouble recreating the same emotional tone as the human examples, although it pretty much nails it for a vocally flat narration. AI might have a gig in the nature documentary business, but video game voice actors probably don’t have to worry just yet.