Terrifying Microsoft AI can build a robo-clone of your voice after just 3 seconds

Microsoft’s ‘VALL-E’ artificial intelligence is capable of mimicking anybody’s voice after hearing just three seconds of speech – the spooky algorithm could have terrifying consequences

The ‘VALL-E’ AI can turn text into speech using your voice (Image: Getty Images/iStockphoto)

Your voice could be digitally cloned and used to impersonate you, thanks to a creepy new AI called VALL-E.

Microsoft has unveiled an artificial intelligence system capable of mimicking any human voice based on just three seconds of audio.

It can then be used to turn any written text into speech, making it possible for someone to put words in your mouth using the tool.

It’s even designed to recreate the ‘emotional range’ and pacing of the speaker, making it a hyper-accurate form of mimicry.

Microsoft trained the AI model on 60,000 hours of English language speech (Image: Getty Images)

The AI tool is thankfully not yet available to the general public. Microsoft says it is a ‘neural codec language model’ trained on 60,000 hours of English language speech from Meta, who own Facebook.

Del, a videogame artist at ‘Last of Us’ creators Naughty Dog, explained: “Using a 3-second sample of human speech, [VALL-E] can generate super-high-quality text-to-speech from the same voice.

“Even emotional range and acoustic environment of the sample data can be reproduced.”

Del added that it could affect the future of audiobooks. “At the moment, VALL-E can only read, not necessarily PERFORM with the emotional, tonal and pacing range of a voice actor. However, much of the audiobook industry relies on a lot of junior voice actor talent that will undoubtedly feel the brunt of this first.”

VALL-E has certainly ruffled a few feathers online. Twitter user Kevin Nash said: “This is terrifying thinking about scam callers getting their hands on this.”

Another user, Christina Kraus, wrote: “What use does this even have except for scam and impersonation purposes? Why don’t we focus on AI where it actually helps humanity? Why are we getting AI image generators and voice imitation? That’s literally the last thing we need.”

However, the tool could prove very useful in a range of contexts. People who lose the ability to speak, such as the late Stephen Hawking, who was unable to talk due to motor neurone disease, could use the AI system to create replicas of their own voices in order to continue communicating with the world.

