Meta declines to make voice tool public as BixeLab highlights voice fraud concerns

Jun 20, 2023, 6:11 pm EDT | Bianca Gonzalez

Categories Biometrics News | Voice Biometrics

AI voice technology can quite literally give a voice to the voiceless and help people transcend language barriers. Alongside such impactful use cases, however, the rise of AI-generated voice technology brings heightened security risks, particularly for systems that use biometric voice authentication and for targets of social engineering attacks, as highlighted in the second issue of BixeLab’s I.D. Risk Alerts newsletter.

BixeLab notes the account of an Australian journalist who used an AI-generated clone of his own voice to gain unauthorized access to his Centrelink account. In the UK, a cybersecurity researcher used an AI-generated version of his own voice to access a bank account. The testing and consulting firm rates the criticality of the fraud risk as “high.”

Aware of the security risks, Meta recently announced, but did not release, its newest generative AI system, Voicebox. The technology can generate spoken dialogue from speech samples and text, and its capabilities include speech denoising and editing, text-to-speech synthesis, and diverse speech sampling. Still, the tech giant is “not making the Voicebox model or code publicly available at this time” due to “the potential risks of misuse.”

Voicebox can create outputs from scratch or based on a sample model. With a word error rate of 1.9 percent, the system outperforms VALL-E, which has an error rate of 5.9 percent. On cross-lingual style transfer, Voicebox outperforms YourTTS with an average word error rate of 5.2 percent compared to 10.9 percent. It also outperforms both VALL-E and YourTTS on audio style similarity.
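For readers unfamiliar with the metric, word error rate is typically the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. For synthesized speech, the transcript usually comes from running a speech recognizer on the generated audio. The following is a minimal Python sketch of that calculation, not the evaluation code behind the figures above; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of six reference words.
rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{rate:.1%}")
```

Under this definition, a 1.9 percent word error rate corresponds to roughly two word errors per hundred reference words.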

The technology is also built on Flow Matching, a non-autoregressive generative modeling approach that can learn a non-deterministic mapping between text and speech, enabling it to learn from varied speech data without using labels. As a result, Voicebox can train on more diverse data on a much larger scale.
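As a rough illustration of the idea, flow matching trains a network to predict the velocity that moves a noise sample toward a data sample along a simple path. The Python sketch below computes one training pair for the straight-line path described in the flow matching literature. It is an assumption-laden toy, not Meta's implementation: the names `flow_matching_pair` and `flow_matching_loss`, the `sigma_min` value, and the use of plain lists in place of audio features are all hypothetical.

```python
from typing import List, Tuple

SIGMA_MIN = 1e-5  # assumed small constant; illustrative value only


def flow_matching_pair(
    x0: List[float], x1: List[float], t: float, sigma_min: float = SIGMA_MIN
) -> Tuple[List[float], List[float]]:
    """Return the point x_t on the noise-to-data path and its target velocity.

    x0 is a noise sample, x1 is a data sample, and t lies in [0, 1].
    """
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * a + t * b for a, b in zip(x0, x1)]
    target = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return x_t, target


def flow_matching_loss(predicted: List[float], target: List[float]) -> float:
    """Mean squared error between a predicted velocity and the target."""
    return sum((p - u) ** 2 for p, u in zip(predicted, target)) / len(target)


# Hypothetical two-dimensional example, halfway along the path.
noise, data = [1.0, 2.0], [3.0, 5.0]
x_t, target = flow_matching_pair(noise, data, t=0.5, sigma_min=0.0)
print(x_t, target)
```

In this sketch the training target depends only on the noise sample and the data sample, so no per-utterance labels are involved, and a model that has learned the velocity field can refine the whole output at once rather than one step at a time.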

Meta trained Voicebox with “more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in English, French, Spanish, German, Polish, and Portuguese.” It can infill speech from context and generate the middle of an audio recording without having to re-create the input entirely.

Voicebox can use a two-second audio sample to match an audio style and then apply that style to text-to-speech generation, which can give a voice to someone unable to speak. Cross-lingual style transfer allows users to turn text from one language into audio in another language, creating a new avenue to overcome language barriers. It can also resynthesize speech to remove background noise, simplifying the audio editing process.

Voice authentication and security threats continue

Voicebox can reportedly enable malicious AI-generated voice cloning that can bypass voice authentication.

The technology can also be used to strengthen social engineering attacks. At the 2023 Regional Anti-Scam Conference in Singapore, Sun Xueling, the Minister of State for Home Affairs, expressed concerns that this technology could be used to impersonate public figures and spread disinformation.

In January, an Arizona mother was the target of a ransom scam that used deepfake voice generation technology to trick her into thinking her own daughter had been kidnapped and held for ransom. “I will never be able to shake that voice and the desperate cries for help out of my mind,” she said in testimony to the Senate Judiciary Committee.
