Synthetic voice biometrics are a shifty chorus of fraud, requiring an agile response
In the Tower of Babel that digitized global society has become, synthetic voice attacks are adding to the confusion around voice as a secure biometric modality. A recent talk presented by the European Association for Biometrics (EAB) explores how synthetic voice attacks have been super-charged by the mainstreaming of generative AI technologies, and how cloud-based platforms and neural networks can be leveraged in response.
Víctor Gomis was a security and biometrics sales expert for Nuance; now, he works for Microsoft, which acquired the voice biometrics company in 2022, absorbing Nuance’s biometrics and conversational AI tech into its larger portfolio. Gomis says deepfakes are putting a new level of pressure on different security channels, and that the only way to keep pace with new AI-based threats is to deploy equally formidable AI-powered defenses.
“This year has seen a great leap in generative AI,” says Gomis. “There’s a pace of innovation that is being accelerated, and unless you have the capability of operating continuously, and have enough specialized manpower and tools to keep up in this race, the system you will be working with is not going to be secure.” Speaking of the “industrialization and scale” of fraud, he says that easy access to tools that can generate deepfakes in minutes means that the playing field of security has changed, and legacy methods are past their expiry date.
Gomis cites statistics from Experian’s 2023 U.S. identity and fraud report showing that 85 percent of consumers report physical biometrics as the most trusted and secure authentication method they have recently encountered. In a demo of Gatekeeper, Nuance’s cloud-native biometric security platform, he proposes that cloud-based approaches combined with AI allow for continuous upgrades, which are necessary to maintain protection.
“This is the new game in town,” he says. “You not only have to have an accurate and very, very secure biometric modality, but you have to develop this capability continuously over time.”
Gatekeeper’s layered system, says Gomis, tests voice biometrics against a number of factors to detect flags that indicate potential fraud. The platform matches audio against a user’s stored voiceprint, but also scans for anomalies and artifacts in the audio that mark it as synthetic, detects when audio is being played back through the speakers of an external audio device, and locks voiceprints after multiple failed login attempts. Add-on features can verify a conversation print by comparing language patterns, verify the origin and authenticity of calls before they connect, and identify fakes using text-to-speech source identifiers embedded as watermarks in the audio.
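The layered approach described above can be sketched in pseudocode-like Python. This is purely illustrative: the scores, thresholds, lockout limit, and the `verify` function are assumptions for the sketch, not the Gatekeeper API, whose internals are not public.

```python
# Hypothetical sketch of a layered voice-verification decision, loosely
# modeled on the factors described above. All names, scores, and thresholds
# are illustrative assumptions, not Nuance/Microsoft Gatekeeper interfaces.
from dataclasses import dataclass

MAX_FAILED_ATTEMPTS = 3  # assumed lockout threshold


@dataclass
class VoiceCheckInputs:
    voiceprint_score: float  # similarity to the stored voiceprint, 0..1
    synthetic_score: float   # likelihood the audio is machine-generated, 0..1
    playback_score: float    # likelihood the audio is replayed via a speaker, 0..1


@dataclass
class Account:
    failed_attempts: int = 0
    locked: bool = False


def verify(inputs: VoiceCheckInputs, account: Account) -> str:
    """Run the layered checks; any failing layer rejects the attempt."""
    if account.locked:
        return "locked"
    ok = (
        inputs.voiceprint_score >= 0.8    # layer 1: match stored voiceprint
        and inputs.synthetic_score < 0.5  # layer 2: no synthesis artifacts
        and inputs.playback_score < 0.5   # layer 3: not a replay attack
    )
    if ok:
        account.failed_attempts = 0
        return "accept"
    account.failed_attempts += 1
    if account.failed_attempts >= MAX_FAILED_ATTEMPTS:
        account.locked = True  # lock the voiceprint after repeated failures
    return "reject"
```

The point of the layering is that each check catches a different attack class, so a deepfake that fools the voiceprint match can still be rejected by the synthesis or playback detectors.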
Gomis says Nuance’s project involves accumulating a “massive amount of data that is generated synthetically, to constantly monitor how our systems are able to detect them and to retrain our systems and DNN solutions that will help to identify these kinds of deepfakes.” Being owned by Microsoft brings more and better resources, and expanded initiatives such as Nexus, an in-house team of certified fraud experts and analysts that provides clients with guidance on Gatekeeper configuration and provides, in Gomis’ words, “the capability to monitor what’s going on, to be able to leverage the U.S. network of customers using the same system, so you can learn from the community.”
In summary, Gomis says, while synthetic voice attacks might not yet be the most common fraud vector, the collective babble of global audio fraud is only getting louder. “What we have here is a new world where you’re going to need constant upgrades,” he says. “You have a world where you want to have a solution that is able to manage different layers of security. And you want a solution that requires certain monitoring and support services to be able to work with the most optimized systems.”