Monitoring synthetic voices as they get realer
In the United States, at least, nothing interesting happens until it happens to a celebrity. Nothing important happens until it hits home. That is what is slowly happening with deepfake biometrics.
It is awe-inspiring when an actor’s face is so realistically applied to someone else’s — putting Tom Cruise’s face over comedian Bill Hader as Hader imitates the movie star is a very good example.
The next logical step is faking voices. A creaky example of that is the creation of dialog for Anthony Bourdain in Roadrunner, a new documentary about the deceased cultural icon. The New Yorker first looked into this in a lengthy article about The Meaning of Bourdain, and The Verge did a quick piece isolating the ersatz quotes.
That is an interesting development.
Of course, that was not faked voice’s debut. Before Bourdain Lite, someone used artificial intelligence to approximate the voice of a CEO and illicitly obtain a transfer of €220,000 in 2019. (If that voice was rougher than approximated Bourdain, the executive in question was significantly gullible.)
When will an ordinary someone get a frantic call at work from a spouse about an emergency requiring a large transfer of money to an unfamiliar bank account?
An article in the Boston Globe surveying local AI firms suggests it is unlikely to happen today, which keeps the topic not quite important for most people.
Mike Pappas, CEO of Modulate, is quoted saying, “pretty vehemently no.” Probably, but the company’s VoiceWear is billed as software that can “instantly transform their voice to better express their virtual identity.”
Reportedly more than 12 hours of audio was required to train an algorithm for the Bourdain stunt.
An executive with the established voice player Nuance in the same article, however, says it is an inevitability. Algorithms will be off and running after training on scraps of speech.
(Nuance last year launched Gatekeeper biometrics to Five9’s App Marketplace to fight fraud.)
A fraudster can steal a passcode and unleash a deepfake that looks like a specific person. Eventually, it will sound like that person. It will be still farther in the future, according to Cheng, before the whole construct will be able to pass a liveness test that asks for an unexpected response.
Delays caused by an algorithm as it does the math will be a giveaway.
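For readers who want to see the idea in concrete terms, here is a minimal sketch in Python of a liveness check that combines an unpredictable prompt with a response-time limit. It is an illustration only, not how Nuance or any other vendor does it. The names `pick_challenge` and `passes_liveness`, the prompt list and the two-second threshold are all assumptions made up for this example.

```python
import secrets

# Hypothetical threshold: replies slower than this are treated as suspicious.
MAX_RESPONSE_SECONDS = 2.0

# Hypothetical prompt pool; a real system would need a far larger, less guessable set.
CHALLENGES = [
    "Say the word 'lantern'",
    "Count backward from seven",
    "Name today's weekday",
]


def pick_challenge():
    """Choose an unpredictable prompt so a reply cannot be generated in advance."""
    return secrets.choice(CHALLENGES)


def passes_liveness(asked_at, answered_at, answer_matches,
                    max_delay=MAX_RESPONSE_SECONDS):
    """Return True only if the reply was correct and arrived within max_delay seconds."""
    delay = answered_at - asked_at
    return bool(answer_matches) and 0 <= delay <= max_delay
```

In use, a caller would record a timestamp when the prompt is issued and another when the spoken reply ends, then pass both to `passes_liveness` along with the result of whatever check confirms the reply matches the prompt. A reply that is correct but arrives late fails, which is the giveaway described above.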