Artificially intelligent software can listen to someone’s voice only a few times, and then speak just like them, like some kind of creepy cybernetic myna bird… according to a paper published by researchers from Baidu.
This technology, when perfected, will be perfect for generating fake audio clips of people saying things they never actually said. In the words of Red Dwarf’s Kryten: file that under ‘B’ for blackmail.
Chinese internet giant Baidu’s AI team is well known for its work on generating realistic-sounding speech from text scripts. Now, its latest research project, revealed this week, shows how a generative model can learn the characteristics of a person’s voice and recreate that sound to make the person say something else entirely.
In the first example here, a woman’s voice in the original clip is heard saying: “the regional newspapers have outperformed national titles.” After her voice is cloned, she now appears to be saying: “the large items have to be put into containers for disposal”. So, as you can hear, the results aren’t perfect.
The best clips generated from the model are still pretty noisy and lower quality than the original speech. But the output of the “neural cloning system” developed by the researchers manages to retain the speaker’s British accent and sounds quite similar to her. Read more from theregister.co.uk…