Pindrop collaboration allows Nvidia to rein in zero-shot cloning feature

Voice cloning with just seconds of audio too risky to release without solid detection

Pindrop has announced a collaboration with Nvidia to “advance defenses against unauthorized synthetic speech in support of building safe, robust, and responsibly deployed AI systems,” according to a company blog post.

Specifically, the voice deepfake detection firm is being tapped to provide adequate defenses against an until-now dormant feature in Nvidia’s Riva Magpie, a quadrilingual text-to-speech (TTS) model.

Zero-shot voice cloning builds on the concept of zero-shot learning, in which a model is asked to make predictions about data classes for which it was given no labeled training examples. Applied to speech, zero-shot cloning lets a TTS model generate a desired voice from just a few seconds of reference audio.

In Pindrop’s words, “‘zero-day’ cloning exploits occur when a new synthetic speech model is used before detection systems have seen or adapted to its artifacts. These blind spots can make even state-of-the-art protections vulnerable.”

The upshot is that voices can be cloned with less reference material and, until now, detection systems had no exposure to the output. For this reason, Nvidia has withheld the feature. But with Pindrop among a group of firms granted early access to help develop and reinforce safeguards, it can soon be released into the world.

Pindrop gets to train its tech on cutting-edge models

The upside for Pindrop is clear: early access allows it to “proactively train detectors against emerging models before they’re widely available.” It says its detectors are designed to find subtle artifacts like unnatural prosody or spectral anomalies in each stage of the TTS process, and that the partnership with Nvidia allows it to assess detection accuracy across “a wide range of conditions, including male and female voices, multiple languages, short and long utterances, and varying sampling rates and compression levels.”

Nvidia’s AI and audio codec architectures are similar enough to the ones on which Pindrop trains its tech that Pindrop’s systems can generalize well, even for models it hasn’t yet encountered.

“In our initial evaluation of Riva Magpie, using a few thousand 5-second utterances, our technology was able to detect over 90 percent of synthetic samples with false accept rates below 1 percent (meaning fewer than 1 in 100 synthetic samples are incorrectly classified as genuine).” In a subsequent pass, samples were augmented with varying levels of noise, sampling rates and compressed video formats; the retrained model brought detection accuracy to 99.2 percent while keeping false accept rates below 1 percent.
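To give a rough sense of what augmenting samples with "varying levels of noise" can look like, here is a minimal sketch in plain Python. Pindrop has not described its pipeline, so everything here is an assumption for illustration: the function names `add_noise_at_snr` and `measured_snr_db` are hypothetical, the noise is simple white Gaussian noise, and a 440 Hz tone at 16 kHz stands in for a 5-second utterance.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at the given SNR (dB)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR (dB) of `noisy` relative to the `clean` reference."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for one 5-second utterance: a 440 Hz tone at 16 kHz.
SAMPLE_RATE = 16000
clean = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
         for t in range(SAMPLE_RATE * 5)]

# Produce one noisy copy per noise level; the levels are illustrative.
rng = random.Random(0)
augmented = {snr: add_noise_at_snr(clean, snr, rng) for snr in (30, 20, 10)}

for snr, noisy in augmented.items():
    print(f"target {snr} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

A real evaluation would also resample the audio and pass it through lossy codecs, which typically requires an audio library and is left out of this sketch.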

The collaboration is framed as a way to make sure detection systems keep up with potentially harmful generative AI. As such, while Pindrop gets training, Nvidia gets to push its latest technology into the market.

But is there a practical use for zero-shot cloning? Nvidia’s press material uses the by-now tired sales pitch that zero-shot cloning “unlocks creative applications,” even though it “can also create new opportunities for misuse, such as impersonation, fraud and misinformation.”

Pindrop’s tech may be up to the task of exposing it – but as AI continues to proliferate, one feels a creeping sense that some firms are simply unleashing potent fraud engines, with little benefit.
