
Pindrop collaboration allows Nvidia to rein in zero-shot cloning feature

Voice cloning with just seconds of audio too risky to release without solid detection

Pindrop has announced a collaboration with Nvidia to “advance defenses against unauthorized synthetic speech in support of building safe, robust, and responsibly deployed AI systems,” according to a company blog post.

Specifically, the voice deepfake detection firm is being tapped to build defenses around an until-now dormant feature in Nvidia’s Riva Magpie, a quadrilingual text-to-speech (TTS) model.

Zero-shot voice cloning is a tool based on the zero-shot learning concept, which refers to scenarios in which a model is not trained on any labeled examples of the data classes it will be asked to make predictions about. As such, zero-shot cloning enables a model to generate synthetic speech in a desired voice using just a few seconds of reference audio.

In Pindrop’s words, “‘zero-day’ cloning exploits occur when a new synthetic speech model is used before detection systems have seen or adapted to its artifacts. These blind spots can make even state-of-the-art protections vulnerable.”

The thrust of it is that it will be easier to clone voices with less reference material, and – until now – nothing could detect it. For this reason, Nvidia has withheld the feature. But with Pindrop among a group of firms granted early access to help develop and reinforce safeguards, it can soon be released into the world.

Pindrop gets to train its tech on cutting-edge models

The upside for Pindrop is clear: early access allows it to “proactively train detectors against emerging models before they’re widely available.” It says its detectors are designed to find subtle artifacts like unnatural prosody or spectral anomalies at each stage of the TTS process, and that the partnership with Nvidia allows it to assess detection accuracy across “a wide range of conditions, including male and female voices, multiple languages, short and long utterances, and varying sampling rates and compression levels.”
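To make the idea of a “spectral anomaly” concrete, here is a toy feature of the kind a detector might use as one input among many: per-frame spectral flatness, which distinguishes noise-like from tonal audio. This is purely an illustrative sketch using numpy; it is not Pindrop’s method, and the function names and frame length are invented for the example.

```python
# Illustrative only: a toy spectral-flatness feature of the kind a
# deepfake detector might use as one input. NOT Pindrop's actual method.
import numpy as np

def spectral_flatness(frame: np.ndarray, eps: float = 1e-10) -> float:
    """Ratio of geometric to arithmetic mean of the power spectrum.
    Values near 1.0 are noise-like; values near 0.0 are tonal."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + eps
    geometric = np.exp(np.mean(np.log(power)))
    arithmetic = np.mean(power)
    return float(geometric / arithmetic)

def frame_flatness(signal: np.ndarray, frame_len: int = 512) -> np.ndarray:
    """Per-frame flatness over non-overlapping frames of the signal."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.array([spectral_flatness(f) for f in frames])

# Sanity check: a pure tone is far more tonal (lower flatness) than noise.
rng = np.random.default_rng(0)
t = np.arange(8192) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
noise = rng.standard_normal(8192)
assert frame_flatness(tone).mean() < frame_flatness(noise).mean()
```

A real detector would combine many such features (prosody statistics, codec fingerprints, learned embeddings) rather than any single spectral measure.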

Nvidia’s AI and audio codec architectures are similar enough to the ones on which Pindrop trains its tech that Pindrop’s systems can generalize well, even for models it hasn’t yet encountered.

Pindrop writes: “In our initial evaluation of Riva Magpie, using a few thousand 5-second utterances, our technology was able to detect over 90 percent of synthetic samples with false accept rates below 1 percent (meaning fewer than 1 in 100 synthetic samples are incorrectly classified as genuine).” In a subsequent pass, samples were augmented with varying levels of noise, sampling rates and compression formats; the re-trained model brought detection accuracy to 99.2 percent, while keeping false accept rates below one percent.
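For readers unfamiliar with these metrics, the two figures quoted can be computed from a labeled evaluation set as below. This is a generic sketch using the definitions in Pindrop’s parenthetical; the sample counts are invented for illustration and are not Pindrop’s data.

```python
# Detection metrics as defined in the quote above.
# All counts below are invented for illustration, not Pindrop's data.
total_synthetic = 5000       # hypothetical number of synthetic test samples
flagged_synthetic = 4520     # correctly detected as synthetic
accepted_as_genuine = 45     # synthetic samples wrongly accepted as genuine
# (the remainder could fall into an "undecided" band in a real system)

detection_rate = flagged_synthetic / total_synthetic
false_accept_rate = accepted_as_genuine / total_synthetic

assert detection_rate > 0.90       # matches "over 90 percent"
assert false_accept_rate < 0.01    # matches "below 1 percent"
```

Note that under this definition the false accept rate is computed over synthetic samples only; biometric presentation attack detection standards use similar per-class error rates.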

The collaboration is framed as a way to make sure detection systems keep up with potentially harmful generative AI. As such, while Pindrop gets training, Nvidia gets to push its latest technology into the market.

But is there a practical use for zero-shot cloning? Nvidia’s press materials offer the by-now tired sales pitch that zero-shot cloning “unlocks creative applications,” even though it “can also create new opportunities for misuse, such as impersonation, fraud and misinformation.”

Pindrop’s tech may be up to the task of exposing it – but as AI continues to proliferate, one feels a creeping sense that some firms are simply unleashing potent fraud engines, with little benefit.
